Tuesday, January 03, 2012

Uses of wildcard types in Java

Many of the Java coding questions I get from my colleagues come down to confusion about wildcard types in Java generics. I'll be editing this post on an ongoing basis to catalog all the ways I have productively used these beasts.

Uses of wildcard types

  1. Compile-time prevention of most incremental modifications to a collection. Because the put() and add() methods of most collection abstractions receive an argument of the element type, the compiler will prevent you from even attempting to call these methods on a variable of type Collection< ? extends T > for any type or type parameter T. Unfortunately, you can still call Collection.clear(), and Collection.add( null ), without a compiler error, so the collection is not completely statically protected from modification.
  2. Compile-time enforcement of write-only collections. The converse of the above is that the get() methods don't work on a variable of type Collection< ? super T > unless the receiving type is Object, which is usually the wrong thing to do anyway. However, you can happily put() and add() Ts to such a collection (and of course you can call clear()).
  3. Broader applicability of parametric library methods. If you have, for example, a utility forAll() which iterates over a collection of elements of type T (and does not use that collection in any other way), the argument type should be Iterable< ? extends T >, and not, say, List< T >, and you should probably have an Iterator< ? extends T > overload as well. If T were CharSequence, for example, you could call forAll( Arrays.asList( "a", "b", "c" ) ) or forAll( new HashSet< StringBuilder >() ) without needing extra copies of the utility logic, casts, or pointless reference copies. If your method requires a more complex nested type, you should use as many wildcards as are allowed, as in Map< ? extends K, ? extends Iterable< ? extends V > >.
  4. Broader utility of parametric library methods. Continuing the above example: because Java has no easy way to do a higher-order method like map() or filter() on collections, you will likely be writing loops in utility methods. Say we are writing a method that iterates a collection, looking for some property of the elements, and produces a new list with only the elements of the original list that had the property in question. It is better to declare the interface like
    public static < T > List< T > filter( Iterable< ? extends T > it );
    than
    public static < T > List< T > filter( Iterable< T > it );
    Why? Say we have a variable s of type Set< ? extends Person >. In the latter case, Java will infer the return type of filter( s ) to be List< ? extends Person >, which can't be used directly by some other method needing a List< Person >; not all APIs are considerate enough to follow the protocol in use number 3, above, which leaves them subject to quirk number 4, below. Perhaps surprisingly, in the former case, Java infers a return type of List< Person > for the call filter( s ).
  5. More flexible class implementations. Generic collaborators (such as Comparators) to whom you only supply Ts should be typed using ? super T, and collaborators from whom you only get Ts should be typed using ? extends T. Note that these collaborators might very well be invariant in their type parameters (that is, there may be methods for consuming and producing Ts available); but if you only use one sort or the other, you should still use wildcards to allow covariance or contravariance, whichever is compatible.
  6. Making up for Java's lame type inference. The following innocuous-looking code will not compile as is (at least on the Java 5 through 7 compilers; the target typing added in Java 8 accepts it):
    List< List< String > > llstr = Arrays.asList( new ArrayList< String >() );
    
    The reason is that the compiler infers the type List< ArrayList< String > > for the right-hand side of the assignment, which is not assignment-compatible with List< List< String > > (see quirk number 3, below). Any of the following edits will compile, however.
    // eschew type inference
    List< List< String > > llstr = Arrays.< List< String > >asList( new ArrayList< String >() );
    // use a cast (horrors!)
    List< List< String > > llstr = Arrays.asList( ( List< String > )new ArrayList< String >() );
    // use a wildcard in the variable type
    List< ? extends List< String > > llstr = Arrays.asList( new ArrayList< String >() );
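
Uses 1, 2, and 5 above can be sketched together in one small class. This is only an illustration under my own assumptions: the class name WildcardRoles and the methods sum, fill, and max are made up for this post and are not from any library. The commented-out lines are the ones the compiler should reject.

```java
import java.util.Collection;
import java.util.Comparator;

final class WildcardRoles {
    // Use 1: we only read Numbers out of src, so "? extends Number" is enough.
    static double sum(Collection<? extends Number> src) {
        double total = 0;
        for (Number n : src) {
            total += n.doubleValue();
        }
        // src.add(Integer.valueOf(1)); // would not compile: element type is an unknown subtype of Number
        return total;
    }

    // Use 2: we only write Integers into dst, so "? super Integer" is enough.
    static void fill(Collection<? super Integer> dst, int count) {
        for (int i = 0; i < count; i++) {
            dst.add(i);
        }
        // Integer first = dst.iterator().next(); // would not compile: only Object is known
    }

    // Use 5: a collaborator we only hand Ts to is typed with "? super T".
    // Returns null when items is empty.
    static <T> T max(Iterable<? extends T> items, Comparator<? super T> order) {
        T best = null;
        for (T item : items) {
            if (best == null || order.compare(item, best) > 0) {
                best = item;
            }
        }
        return best;
    }
}
```

With these signatures, a List< Integer > can be passed to sum, a List< Number > can be passed to fill, and a Comparator< CharSequence > can be used to find the longest element of a List< String >.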
    

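For uses 3 and 4, here is a minimal sketch of a filter utility, assuming Java 8 or later. The Predicate parameter is my addition (the signature in the text above has no way to name the property), and the Person and Employee classes are hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

final class FilterSketch {
    static class Person {
        final String name;
        Person(String name) { this.name = name; }
    }

    static class Employee extends Person {
        Employee(String name) { super(name); }
    }

    // Hypothetical utility: keeps the elements for which keep.test(...) is true.
    // The input is typed Iterable<? extends T>, as recommended in use 4.
    static <T> List<T> filter(Iterable<? extends T> it, Predicate<? super T> keep) {
        List<T> out = new ArrayList<T>();
        for (T t : it) {
            if (keep.test(t)) {
                out.add(t);
            }
        }
        return out;
    }

    static List<Person> namesStartingWithA(Set<? extends Person> s) {
        // T is inferred as Person here, even though s has a wildcard type.
        return filter(s, p -> p.name.startsWith("A"));
    }
}
```

If the first parameter were declared as Iterable< T > instead, I would expect the return statement in namesStartingWithA to be rejected, because T would be pinned to the captured wildcard rather than to Person.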
Potential use of wildcard types, if only the damn language would allow it

  1. Anonymous intersection types. Suppose you want to act on a collection of objects that implement the two interfaces I and J (the technique could work for any finite number of interfaces and, optionally, one class). Your method signature could look like
    public void doSomething( List< ? extends I & J > lij ); // won't compile
    
    except that constraints are only allowed where type parameters are declared (i.e., class C< T extends I & J > {}, interface E< T extends I & J > {}, < T extends I & J > void f() {}). You are thus reduced to giving a name to the intersection type, as in
    public < IJ extends I & J > void doSomething( List< IJ > lij );
    
    An advantage of using a name instead of a wildcard is that this technique works directly with a single argument of both interfaces:
    public < IJ extends I & J > void doSomething( IJ ij ) {
        // ...
        ij.iMethod();
        ij.jMethod();
        iConsumer( ij );
        jConsumer( ij );
        // ...
    }
    
    The same restriction applies to attempting to return an anonymous or private intersection type. There is no way to return something like ? extends I & J without giving that intersection type a name like IJ.
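
To make the named-intersection workaround concrete, here is a small sketch that uses two real JDK interfaces in place of I and J: Appendable and CharSequence, both of which StringBuilder implements. The class name IntersectionSketch and the method appendAll are made up for illustration.

```java
import java.io.IOException;
import java.util.List;

final class IntersectionSketch {
    // B plays the role of the named intersection "IJ",
    // with I = Appendable and J = CharSequence.
    static <B extends Appendable & CharSequence> int appendAll(List<B> buffers, String suffix)
            throws IOException {
        int totalLength = 0;
        for (B b : buffers) {
            b.append(suffix);          // method from Appendable (declares IOException)
            totalLength += b.length(); // method from CharSequence
        }
        return totalLength;
    }
}
```

Calling appendAll with a List< StringBuilder > should compile without any cast, since the compiler can bind B to StringBuilder.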

Quirks of wildcard types

  1. You can't construct an instance of a wildcard type using new. You can, however, use a factory method to accomplish the same thing, which makes me wonder why the designers of Java didn't just make new work. Given
    class C< T > {
        public C() {}
        public static < T > C< T > make() {
            return new C< T >();
        }
    }
    
    the following lines (using type inference) are ok:
    C< ? extends T > ct = C.make();
    C< ? super T > ct = C.make();
    but the following won't compile ("wildcard not allowed at this location"):
    C< ? extends T > ct = new C< ? extends T >();
    C< ? super T > ct = new C< ? super T >();
    C< ? super T > ct = C.< ? super T >make();
  2. Well, you can actually use new for nested wildcard types. As long as the wildcard is not at the top level, it works.
    C< C< ? extends T > > cct = new C< C< ? extends T > >();
  3. Generic types are invariant in their type parameters. In simpler terms, a C< Employee > is not a C< Person >, even if it should be by all rights (for example, if there are no methods consuming the parameter type T in C< T >). The opposite relationship also does not hold, even if it should by all rights. Instead, you get to make either relationship hold at the point where such objects are used, usually by using wildcard types. A C< Employee > is a C< ? extends Person >, and a C< Person > is a C< ? super Employee >.
  4. Wildcard types are supertypes. A C< Person > is a C< ? extends Person > and also a C< ? super Person >. No other relationships hold between the three types. It is for this reason that I advocate interfaces like uses 3 and 4, above; a method declared to expect a List< Person > is just not going to accept a List< ? extends Person > or a List< Employee >, even if it only consumed elements from the input as Persons.
  5. Type inference will not unify separate expressions of wildcard type. Given any two separate expressions, if one or both are of declared wildcard type, they will never be treated by the compiler as having the same type. This property holds even if the two expressions are identical references to a variable, or if one is created from the other by an endo-function. To "unify" the wildcard in two expressions of wildcard type, those expressions have to be declared in a generic method (a "capture-helper") which binds an actual type variable in place of the wildcard. Check out these examples:
    public class WildcardTest1 {
        private < T > void two( Class< T > t1, Class< T > t2 ) {}
        private < T > void one( Class< T > t1 ) {
            two( t1, t1 ); // compiles; no wildcards involved
        }
        private void blah() {
            two( WildcardTest1.class, WildcardTest1.class ); // compiles
            one( WildcardTest1.class );                     // compiles
    
            Class< ? extends WildcardTest1 > wc = this.getClass();
            two( wc, wc ); // won't compile! (capture#2 and capture#3)
            one( wc );     // compiles
        }
    }
    public class WildcardTest2 {
        interface Thing< T > {
            void consume( T t );
        }
        private < T > Thing< T > make( Class< T > c ) {
            return new Thing< T >() {
                @Override public void consume( T t ) {}
            };
        }
        private < T > void makeAndConsume( Object t, Class< T > c ) {
            make( c ).consume( c.cast( t ) );
        }
    
        private void blah() {
            Class< ? extends WildcardTest2 > wc = this.getClass();
            make( wc ).consume( wc.cast( this ) ); // won't compile! (capture#2 and capture#3)
            makeAndConsume( this, wc );            // compiles
        }
    }
    Incidentally, the makeAndConsume method does not compile (for the same reason) if you declare its signature as private < T > void makeAndConsume( Object t, Class< ? extends T > c ), which is a counter-example to use #4, above. Capture-helper methods must not have wildcard types in their parameters.
  6. The implicit existential is on the nearest class. What I mean here is that anytime there is a wildcard buried deep inside a parameterized type, such as F< G< H< ? extends T > > >, the existential binds the wildcard as closely as possible; something like F< G< (∃ S extends T) H< S > > >, as opposed to, say, (∃ S extends T) F< G< H< S > > >. In practice, this means that G is heterogeneous; if G were a collection type, for example, its elements could be a mixture of H< S1 > and H< S2 > for two different subtypes S1, S2 of T, and still be type-correct.
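
The heterogeneity described in quirk 6 can be seen with plain lists. In this sketch (the names NestedWildcardSketch, mixedRows, and sumAll are mine), the outer list holds both a List< Integer > and a List< Double >, and the code still type-checks. It also leans on quirk 2, since new is applied to a type whose wildcard is nested.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class NestedWildcardSketch {
    // Each row may have its own element type, as long as it is a subtype of Number.
    static List<List<? extends Number>> mixedRows() {
        List<List<? extends Number>> rows = new ArrayList<List<? extends Number>>();
        rows.add(Arrays.asList(1, 2));     // a List<Integer>
        rows.add(Arrays.asList(0.5, 1.5)); // a List<Double>
        return rows;
    }

    // The wildcard is bound per row, so each row is read as Numbers.
    static double sumAll(List<List<? extends Number>> rows) {
        double total = 0;
        for (List<? extends Number> row : rows) {
            for (Number n : row) {
                total += n.doubleValue();
            }
        }
        return total;
    }
}
```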

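Going back to quirk 5, the capture-helper idiom is perhaps easiest to see on a List< ? >. The following is a sketch with made-up names (CaptureHelperSketch, swapEnds, swapEndsHelper): the public method accepts any list, and the private generic method gives the unknown element type a single name so that get and set agree.

```java
import java.util.List;

final class CaptureHelperSketch {
    // Public face: callers may pass a list of any element type.
    static void swapEnds(List<?> list) {
        // list.set(0, list.get(list.size() - 1)); // would not compile: two separate captures
        swapEndsHelper(list);
    }

    // Capture helper: E names the unknown element type once for the whole body.
    private static <E> void swapEndsHelper(List<E> list) {
        if (list.size() < 2) {
            return;
        }
        int last = list.size() - 1;
        E first = list.get(0);
        list.set(0, list.get(last));
        list.set(last, first);
    }
}
```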