
Reflecting on reflection in Scala and Java

Reflection has a reputation for being a bit of trickery, and while there may be some truth to that, it does not mean one cannot mastermind some cool stuff with it, and by cool I mean useful, reusable, elegant and the like. In this post we put on the wizard hat and pull out some rabbits.

Problem Statement

In this post we'll discuss 3 use-cases in which we want to retrieve runtime information for some generic parameters. In all three of the following use-cases, our goal is to retrieve the runtime information for the generic type parameter T.

The first use case is when you have a class deriving from a generic class, without having generic parameters of its own:
class GenericInt extends Generic<Integer> {}
// one wants to know the type of T of its parent
new GenericInt()

The second use case is when you create a new instance of a generic class by providing a generic parameter:
// one wants to know the type of T inside the class
new GenericClass<Integer>()

The third use case is when you invoke a generic method and provide it with a generic parameter:
// one wants to know the type of T inside the method
myInstance.<Integer>foo()



What about Java

Back in my C# days, it was really cool you could investigate your generic types at runtime, which meant you could do:
// this will work inside a class that has a generic type parameter
// T in its declaration
Type typeParameterType = typeof(T);

In Java, however, that is not possible, because when you do:
// good luck figuring out what T is inside Foo...
Foo<Integer> foo = new Foo<Integer>();
the information about T being an Integer is lost at runtime, a.k.a. type erasure. This is rather unfortunate if one was counting on applying some type-specific logic, that is, executing different logic depending on the type of the generic parameter at hand, e.g., if T is Integer, do this, if T is MyClass, do that. That will simply not work, a.k.a. bummer.
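To make erasure concrete, here is a minimal sketch. The class name ErasureDemo and its nested Foo<T> are hypothetical stand-ins for the Foo class mentioned above. It shows that two differently parameterized instances share a single runtime class, and that the only generic information the class object carries is the declared name of the type variable.

```java
class ErasureDemo {
    // Hypothetical stand-in for the Foo<T> class from the text.
    static class Foo<T> {}

    // True when two differently parameterized instances share one runtime class.
    static boolean sameRuntimeClass() {
        Foo<Integer> ints = new Foo<>();
        Foo<String> strings = new Foo<>();
        return ints.getClass() == strings.getClass();
    }

    // The class object only knows the declared type variable, not what it was bound to.
    static String declaredTypeParameterName() {
        return Foo.class.getTypeParameters()[0].getName();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());          // prints "true"
        System.out.println(declaredTypeParameterName()); // prints "T"
    }
}
```

Nothing on the instance itself records that T was Integer or String, so type-specific logic cannot be driven from it.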

It's important to point out a great source of confusion here. As nicely put by this stackoverflow answer, keep in mind that declared type info, as opposed to runtime type info, IS available at runtime, i.e.:
// here you CAN figure out at runtime that GenericInt derives from Generic<T>,
// and T is Integer, since you have a compile-time type that captures this
public class GenericInt extends Generic<Integer> {}
The generic parameter's type can be extracted like so:
// requires: import java.lang.reflect.ParameterizedType;
ParameterizedType genericSuperclass =
  (ParameterizedType) GenericInt.class.getGenericSuperclass();
Class<?> genericParam = (Class<?>) genericSuperclass.getActualTypeArguments()[0];

When dealing with use-case (1) some try to expose the fixed generic types via designated getter methods, e.g.:
class GenericInt extends Generic<Integer> {
  public Class<Integer> getType() { return Integer.class; }
}

While this may help in getting the generic types' info, it is hardly a good solution, as it pollutes the class with a bunch of per-generic-type methods that need not be present. The same information can be extracted by properly using reflection.
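As a rough sketch of what "properly using reflection" could look like, the base class can resolve its own type argument once, in its constructor, so that subclasses need no getter at all. The names SelfDescribingDemo and typeArgument are made up for this illustration, and the sketch assumes a direct subclass that fixes T to a plain class (deeper hierarchies or nested generic arguments would need more work).

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

class SelfDescribingDemo {
    static abstract class Generic<T> {
        private final Class<?> typeArgument;

        protected Generic() {
            // getClass() is the concrete subclass, e.g. GenericInt
            Type superclass = getClass().getGenericSuperclass();
            if (!(superclass instanceof ParameterizedType)) {
                throw new IllegalStateException("Expected a direct subclass that fixes T");
            }
            Type argument = ((ParameterizedType) superclass).getActualTypeArguments()[0];
            if (!(argument instanceof Class)) {
                throw new IllegalStateException("T is not bound to a plain class: " + argument);
            }
            this.typeArgument = (Class<?>) argument;
        }

        public Class<?> typeArgument() {
            return typeArgument;
        }
    }

    // No per-type getter needed here.
    static class GenericInt extends Generic<Integer> {}

    public static void main(String[] args) {
        System.out.println(new GenericInt().typeArgument().getName()); // prints "java.lang.Integer"
    }
}
```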

So, to sum up what we've discussed so far: in Java, runtime info for generic types is unavailable, but declared type info is. Getting back to the use-cases above, (1) is supported, while (2) and (3) are not.
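The difference between declared and runtime type info can be seen side by side in the following sketch (DeclaredVsRuntimeDemo, Box and firstTypeArgument are hypothetical names). A plain instance reveals nothing about its type argument, whereas an anonymous subclass created at the call site turns use-case (2) into use-case (1), because the anonymous class declares its parent with the type argument fixed.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.Optional;

class DeclaredVsRuntimeDemo {
    static class Box<T> {}

    // Returns the first type argument of the object's declared superclass, if any.
    static Optional<Type> firstTypeArgument(Object o) {
        Type superclass = o.getClass().getGenericSuperclass();
        if (superclass instanceof ParameterizedType) {
            return Optional.of(((ParameterizedType) superclass).getActualTypeArguments()[0]);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // plain instance: the superclass is Object, nothing to find
        System.out.println(firstTypeArgument(new Box<Integer>()).isPresent());    // prints "false"
        // anonymous subclass: its declared superclass is Box<Integer>
        System.out.println(firstTypeArgument(new Box<Integer>() {}).isPresent()); // prints "true"
    }
}
```

The trailing {} is easy to miss, and it creates a new class per call site, so this is a trade-off rather than a free lunch.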

What about Scala

Scala introduced TypeTags and ClassTags (the successors of Manifests), which facilitate use-cases (2) and (3):
import scala.reflect.{ClassTag, classTag}

def runtimeClassOf[T: ClassTag]: Class[T] = {
  classTag[T].runtimeClass.asInstanceOf[Class[T]]
}

class Generic[T : ClassTag] {
  println(runtimeClassOf[T].getName)
}

// will print "java.lang.Integer", sweet!
new Generic[Integer]
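For readers coming from Java, it may help to see what the ClassTag context bound amounts to. A context bound such as "T: ClassTag" is shorthand for an extra implicit parameter of type ClassTag[T] that the compiler fills in at the call site. The closest hand-written Java analogue is to pass a Class<T> token explicitly, as in this sketch (ClassTokenDemo, typeName and describe are names invented for the illustration):

```java
class ClassTokenDemo {
    // use-case (2): the class carries a token for its type parameter
    static class Generic<T> {
        private final Class<T> type;

        Generic(Class<T> type) {
            this.type = type;
        }

        String typeName() {
            return type.getName();
        }
    }

    // use-case (3): the method receives a token for its type parameter
    static <T> String describe(Class<T> type) {
        return type.getName();
    }

    public static void main(String[] args) {
        System.out.println(new Generic<>(Integer.class).typeName()); // prints "java.lang.Integer"
        System.out.println(describe(String.class));                 // prints "java.lang.String"
    }
}
```

The difference is mostly ergonomic: in Java the caller has to supply the token by hand, while in Scala the compiler supplies the ClassTag.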

As for use-case (1), it would be tempting to use the same code we saw for Java:
import java.lang.reflect.ParameterizedType

class GenericInt extends Generic[Int] {
  val genericSuperclass =
    classOf[GenericInt]
      .getGenericSuperclass
      .asInstanceOf[ParameterizedType]
  val genericParam = genericSuperclass.getActualTypeArguments.apply(0)
  println(genericParam)
}

new GenericInt // prints out "class java.lang.Object", hmm...?

But it does not work, and instead of printing Int, prints Object. This brings us to this question on stackoverflow, and based on the accepted answer we come up with the following:
class GenericInt extends Generic[Int] {

  import scala.reflect.runtime.universe._

  val rm = runtimeMirror(getClass.getClassLoader)
  val derivedSym = rm.staticClass(classOf[GenericInt].getName)
  val baseSym = rm.staticClass(classOf[Generic[_]].getName)
  val TypeRef(_, _, params) = derivedSym.typeSignature.baseType(baseSym)
  val genericParamClass = Class.forName(params.head.typeSymbol.fullName)
  println(genericParamClass)
}

new GenericInt  // prints out "class scala.Int"

Which gives us a way to figure out the generic types' info in use-case (1).

Come to think of it, unlike use-cases (2) and (3), in (1) the generic types are already known at compile time. Wouldn't it be nice if we could leverage that somehow?

Using a technique known as the visitor pattern, we can:
abstract class Generic[T: ClassTag] {
  def visit(visitor: GenericVisitor): Unit
}

class GenericInt extends Generic[Int] {
  override def visit(visitor: GenericVisitor): Unit = {
    visitor.visit(this)
  }
}

class GenericVisitor {
  private def runtimeClassOf[T: ClassTag]: Class[T] = {
    classTag[T].runtimeClass.asInstanceOf[Class[T]]
  }
  def visit[T: ClassTag](friend: Generic[T]): Unit = {
    // welcome to type-safe land, T is Int, code goes here
    println(runtimeClassOf[T].getCanonicalName) // prints out "int"
  }
}

// the whole thing in action: 
val instance =
  Class.forName(classOf[GenericInt].getName)
  .newInstance().asInstanceOf[Generic[_]]

instance.visit(new GenericVisitor)

The thing to note here is that we start off with only the class' name, "classOf[GenericInt].getName" (in string form), and by using the visitor pattern, "instance.visit(new GenericVisitor)", we bind its generic parameters in a way that is type-safe at compile time. Inside the visitor's visit method we are type-safe and have information about the generic parameter available to us - that's a long way to go from a string!
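The same double dispatch can be sketched in Java, under the assumption that the visited class hands over a Class<T> token in place of Scala's ClassTag. All names here (VisitorDemo, accept, TypeNameVisitor, visitByName) are hypothetical and not taken from the code above. The point is the same: we start from a class name in a string, and end up inside a generic method where T is bound.

```java
class VisitorDemo {
    interface GenericVisitor<R> {
        <T> R visit(Generic<T> visited, Class<T> type);
    }

    static abstract class Generic<T> {
        abstract <R> R accept(GenericVisitor<R> visitor);
    }

    static class GenericInt extends Generic<Integer> {
        @Override
        <R> R accept(GenericVisitor<R> visitor) {
            // T is statically known to be Integer at this point
            return visitor.visit(this, Integer.class);
        }
    }

    static class TypeNameVisitor implements GenericVisitor<String> {
        @Override
        public <T> String visit(Generic<T> visited, Class<T> type) {
            // T and its token agree here, checked by the compiler
            return type.getName();
        }
    }

    // Starts from nothing but a class name and ends up in the typed visit method.
    static String visitByName(String className) {
        try {
            Generic<?> instance =
                (Generic<?>) Class.forName(className).getDeclaredConstructor().newInstance();
            return instance.accept(new TypeNameVisitor());
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(visitByName(GenericInt.class.getName())); // prints "java.lang.Integer"
    }
}
```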

For the final trick, let's take things even further and invoke the visit method itself using reflection, and thus eliminate the need to have it present in the visited class definition.
import java.lang.reflect.Method
import scala.reflect.ClassTag
import scala.util.{Failure, Try}
import scala.util.control.NonFatal

def tryFindVisitMethod(inVisitor: Class[_ <: GenericVisitor],
                       visitedClass: Class[_ <: Generic[_]],
                       visitMethodName: String): Try[Method] = {
  Try(inVisitor
        .getMethods
        .filter(_.getName == visitMethodName)
        .find(_.getParameterTypes.head.isAssignableFrom(visitedClass))
        .get) recoverWith {
    case NonFatal(e) =>
      Failure(new RuntimeException(s"Could not find a suitable method: " +
                                     s"[$visitMethodName] that accepts " +
                                     s"[${visitedClass.getName}]",
                                   e))
  }
}

def tryBuildClassTags(classNameToVisit: String): Try[List[ClassTag[_]]] = {
  Try {
    import scala.reflect.runtime.universe._
    val runtime = runtimeMirror(this.getClass.getClassLoader)
    // the parents of the visited class, as seen by Scala reflection
    val ClassInfoType(parents, _, _) =
      runtime.staticClass(classNameToVisit).typeSignature
    // pick the parent that is a Generic[_] and extract its type arguments
    val Some(TypeRef(_, _, genericParams)) =
      parents.find(parent =>
                     classOf[Generic[_]]
                       .isAssignableFrom(Class.forName(parent.typeSymbol.fullName)))
    // resolve each type argument to its runtime class via the mirror
    genericParams.map(runtime.runtimeClass).map(ClassTag(_))
  } recoverWith {
    case NonFatal(e) =>
      Failure(new RuntimeException(s"Could not produce class tags for generic " +
                                     s"parameters of $classNameToVisit", e))
  }
}

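def visit[T <: Generic[_]](visitMethod: Method,
                           instance: T,
                           classTags: List[ClassTag[_]]): Unit = {
  import scala.collection.JavaConverters._
  // the compiled visit method takes the visited instance followed by
  // one ClassTag per context bound, hence the combined argument list
  visitMethod.invoke(new GenericVisitor,
                     (instance :: classTags).asJava.toArray: _*)
}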

And finally:
// let's put it all together
val classNameToVisit = classOf[GenericInt].getName
val classToVisit =
  Class.forName(classNameToVisit).asInstanceOf[Class[_ <: Generic[_]]]
val instanceToVisit = classToVisit.newInstance()

{
  for {
    visitMethod <- tryFindVisitMethod(classOf[GenericVisitor],
                                      classToVisit,
                                      "visit")
    classTags <- tryBuildClassTags(classNameToVisit)
  } yield {
    // effectively: "(new GenericVisitor).visit(new GenericInt)"
    // which eliminates the need for a "visit" method in GenericInt
    visit(visitMethod, instanceToVisit, classTags)
  }
}.get
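The three steps above (find a suitable visit method, recover the type arguments of the visited class, invoke reflectively) can also be sketched in plain Java. This is only an analogue under stated assumptions: a Class<T> token plays the role of the ClassTag, the visited class directly extends Generic with a concrete type argument, and all names (ReflectiveVisitDemo, findVisitMethod, typeArgumentOf, visitByName) are invented for the example.

```java
import java.lang.reflect.Method;
import java.lang.reflect.ParameterizedType;
import java.util.Arrays;

class ReflectiveVisitDemo {
    static abstract class Generic<T> {}

    // Note: the visited class has no accept/visit method of its own.
    static class GenericInt extends Generic<Integer> {}

    static class GenericVisitor {
        public <T> String visit(Generic<T> visited, Class<T> type) {
            return type.getName();
        }
    }

    // Step 1: find a method by name whose first parameter accepts the visited class.
    static Method findVisitMethod(Class<?> visitorClass, Class<?> visitedClass, String name) {
        return Arrays.stream(visitorClass.getMethods())
            .filter(m -> m.getName().equals(name))
            .filter(m -> m.getParameterCount() > 0
                      && m.getParameterTypes()[0].isAssignableFrom(visitedClass))
            .findFirst()
            .orElseThrow(() -> new IllegalArgumentException(
                "No method [" + name + "] accepts [" + visitedClass.getName() + "]"));
    }

    // Step 2: recover the declared type argument of the visited class.
    static Class<?> typeArgumentOf(Class<?> visitedClass) {
        ParameterizedType parent = (ParameterizedType) visitedClass.getGenericSuperclass();
        return (Class<?>) parent.getActualTypeArguments()[0];
    }

    // Step 3: instantiate by name and invoke visit(instance, typeToken) reflectively.
    static Object visitByName(String className) {
        try {
            Class<?> visitedClass = Class.forName(className);
            Object instance = visitedClass.getDeclaredConstructor().newInstance();
            Method visit = findVisitMethod(GenericVisitor.class, visitedClass, "visit");
            return visit.invoke(new GenericVisitor(), instance, typeArgumentOf(visitedClass));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Reflective visit failed for " + className, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(visitByName(GenericInt.class.getName())); // prints "java.lang.Integer"
    }
}
```

As in the Scala version, the price of dropping the visit method from the visited class is that the compiler no longer checks the call; a mismatch only shows up at runtime.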

What about us

To sum things up, in this post we've discussed three cases where runtime information is required for generic type parameters. Some of these use-cases are not well supported in Java, while Scala provides tools to deal with all of them.
We showed how these use-cases can be dealt with in Scala, and how the visitor pattern can be helpful for this purpose. Finally, we also saw how the visitor pattern can be implemented using reflection, which allows one to remove the explicit presence of the "visit" method in visited classes.

Reflection is by no means trivial, let alone in Scala, but given proper usage it can be a very powerful tool.

Many thanks to the authors of the highly informative questions and answers on stackoverflow referenced in this post. Not having all this info available would have made writing this post so much harder.

Edit (13/03/2016):
Fixed the ClassTag building method, which had a bug that was caused by doing "genericParams.map(param => ClassTag(Class.forName(param.typeSymbol.fullName)))" instead of "genericParams.map(runtime.runtimeClass).map(ClassTag(_))" and thus losing some important type info.
