Sunday, June 26, 2005 - 12:42

Serialization in .NET

As an aside, before I start talking about what I've learned about serialization in the last week, let me say that it really seems as though fate, destiny, or the religious entity of your choice seriously wants me to go to London - Saturday morning my fridge just stopped working (in fact, even worse, it started heating everything up instead of cooling it down). Coming on top of everything else, it just seems like a very distinct sign :-) And believe me, I really would love to just pack and go, but I can't do that just yet.

Anyway, on to serialization. I've kinda used it in the past, but not directly, and I was amazed to find out just how useful it can be, especially when paired with reflection.

First, two small things that are really quite obvious but that I never really realized until a week or two ago. One: you don't need to implement ISerializable; you can just mark your class with [Serializable]. I'm not 100% sure why you would want to implement ISerializable; I guess maybe if you wanted to control exactly what gets written out and read back. If you use the Serializable attribute, the formatter serializes all of the object's fields, private as well as public (properties are only covered through their backing fields), unless you mark a field as [NonSerialized]. And if you want to set any of those skipped fields manually on deserialization, you can implement the IDeserializationCallback interface. One word of warning: classes that only use the Serializable attribute will return false to 'obj is ISerializable'; use obj.GetType().IsSerializable instead.
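As a minimal sketch of the attribute approach (the Customer class and its members are made up for illustration, not taken from any real code), this shows a [NonSerialized] field being rebuilt through IDeserializationCallback:

```csharp
using System;
using System.Runtime.Serialization;

// Hypothetical example class: marked with the attribute, no ISerializable.
[Serializable]
public class Customer : IDeserializationCallback
{
    public string Name;           // public field: serialized
    private int visits;           // private field: serialized as well

    [NonSerialized]
    private string displayName;   // skipped by the formatter, so null after deserialization

    public Customer(string name, int visits)
    {
        Name = name;
        this.visits = visits;
        displayName = BuildDisplayName();
    }

    public string DisplayName
    {
        get { return displayName; }
    }

    private string BuildDisplayName()
    {
        return Name + " (" + visits + " visits)";
    }

    // Called by the formatter once the whole object graph has been deserialized;
    // constructors are not run, so the skipped field is rebuilt here.
    public void OnDeserialization(object sender)
    {
        displayName = BuildDisplayName();
    }
}
```

With this class, 'customer is ISerializable' is false while customer.GetType().IsSerializable is true, which is the gotcha mentioned above.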

Two: saving objects to file is really, really easy. All you need to do is create a BinaryFormatter (or SoapFormatter, or whatever you want), use it to serialize the object to a stream, and then save the stream. To retrieve the object, load the stream from the file and deserialize it back into an object. The only catch is that to deserialize, you need to know what kind of formatter was used to serialize the object, and use that same formatter type to deserialize it.
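Something along these lines should do it, assuming the .NET Framework BinaryFormatter; the Settings class, the ObjectFile helper and the file name in the usage note are all invented for the example:

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical class to save; any [Serializable] type would do.
[Serializable]
public class Settings
{
    public string UserName;
    public int FontSize;
}

// Hypothetical helper, not part of the framework.
public static class ObjectFile
{
    // Serialize the object graph straight into a file stream.
    public static void Save(string path, object graph)
    {
        IFormatter formatter = new BinaryFormatter();
        using (Stream stream = File.Create(path))
        {
            formatter.Serialize(stream, graph);
        }
    }

    // Must use the same formatter type that wrote the file.
    public static object Load(string path)
    {
        IFormatter formatter = new BinaryFormatter();
        using (Stream stream = File.OpenRead(path))
        {
            return formatter.Deserialize(stream);
        }
    }
}
```

Usage would be ObjectFile.Save("settings.bin", settings) followed later by (Settings)ObjectFile.Load("settings.bin"); the cast is needed because Deserialize only hands back an object.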

So those are pretty simple, and I should have figured them out ages ago. But I came across two more things, which are more complex (but not difficult to code), and are really cool.

The first one is serialization surrogates. What happens when you need to serialize an object that isn't marked as serializable and doesn't implement ISerializable, and you don't have access to the code defining the class? My first instinct was to create a wrapper class, copy all the info across, and make my wrapper class serializable. But there is an easier way: implementing ISerializationSurrogate. I don't truly understand the details of how it works (see the reference at the end for more detail), but basically what you do is create a class which implements ISerializationSurrogate and contains just two methods: SetObjectData(...) and GetObjectData(...). Each method takes, amongst other parameters, a SerializationInfo object and the object to serialize/deserialize. In GetObjectData() you copy the info from the object you want to serialize into the SerializationInfo object, and in SetObjectData() you do the reverse for deserialization. The object passed to SetObjectData is initialized, in the sense that memory has been allocated for it, but no constructors have been called on it. In theory this is supposed to work fine; in practice, I found that the first thing to do is create a new object of the correct type using a constructor, otherwise you end up with null reference exceptions. Then, despite what you may see on the net and in the articles referenced below, you must return this object that you've created. I found it easiest to just use reflection to get all the public, readable, writeable, serializable properties, and iterate through those to add them to or set them from the SerializationInfo parameter.
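Here is a rough sketch of a surrogate written that way. The Message class below is only a made-up stand-in for a third-party class you can't modify, and the GetCandidateProperties helper is my own invention, so treat the whole thing as an illustration rather than the definitive way to do it:

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Runtime.Serialization;

// Hypothetical stand-in for a class that is not marked [Serializable].
public class Message
{
    private string subject;
    private int priority;

    public string Subject
    {
        get { return subject; }
        set { subject = value; }
    }

    public int Priority
    {
        get { return priority; }
        set { priority = value; }
    }
}

public class MessageSerializationSurrogate : ISerializationSurrogate
{
    // Serialization: copy each candidate property into the SerializationInfo.
    public void GetObjectData(object obj, SerializationInfo info, StreamingContext context)
    {
        foreach (PropertyInfo property in GetCandidateProperties(obj.GetType()))
        {
            info.AddValue(property.Name, property.GetValue(obj, null), property.PropertyType);
        }
    }

    // Deserialization: build a fresh object with a real constructor (the obj
    // passed in has had no constructor run), fill it, and return it.
    public object SetObjectData(object obj, SerializationInfo info,
                                StreamingContext context, ISurrogateSelector selector)
    {
        Message message = new Message();
        foreach (PropertyInfo property in GetCandidateProperties(typeof(Message)))
        {
            property.SetValue(message, info.GetValue(property.Name, property.PropertyType), null);
        }
        return message;
    }

    // Public, readable, writeable, non-indexed properties of a serializable type.
    private static List<PropertyInfo> GetCandidateProperties(Type type)
    {
        List<PropertyInfo> result = new List<PropertyInfo>();
        foreach (PropertyInfo property in type.GetProperties(BindingFlags.Public | BindingFlags.Instance))
        {
            if (property.CanRead && property.CanWrite
                && property.GetIndexParameters().Length == 0
                && property.PropertyType.IsSerializable)
            {
                result.Add(property);
            }
        }
        return result;
    }
}
```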

Finally, all you need to do is set it up so that your surrogate gets called instead of the normal serialization method:

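using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

// Tell the formatter to use the surrogate whenever it meets a Message.
IFormatter formatter = new BinaryFormatter();
SurrogateSelector ss = new SurrogateSelector();
ss.AddSurrogate(typeof(Message),
    new StreamingContext(StreamingContextStates.All),
    new MessageSerializationSurrogate());
formatter.SurrogateSelector = ss;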

And that's it. You can do more funky stuff here if you really want to, but you can ignore the details if you prefer.

Of course, this is a recursive process - unless it's a very simple class, you'll probably find that it contains non-serializable objects itself. If it does, and if you really do need them to be serialized as well, you'll need to write serialization surrogates for them too - and it can be an almost never-ending process, similar to making a deep clone. In fact, you can use serialization to make deep clones - serialize the object to a stream, then deserialize the stream into a new object. It's slightly more effort than the copy constructor method of deep cloning objects, but if you have the serialization methods in place anyway, I guess you might as well use them.
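The deep clone trick can be sketched roughly like this, assuming everything in the object graph is serializable (or has a surrogate registered); DeepCloner and CloneSample are names I've made up for the example:

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical type with a nested reference, just to have something to clone.
[Serializable]
public class CloneSample
{
    public string Label;
    public CloneSample Child;
}

// Hypothetical helper, not part of the framework.
public static class DeepCloner
{
    public static T DeepClone<T>(T source)
    {
        IFormatter formatter = new BinaryFormatter();
        using (MemoryStream stream = new MemoryStream())
        {
            formatter.Serialize(stream, source);
            stream.Position = 0;   // rewind before reading the graph back
            return (T)formatter.Deserialize(stream);
        }
    }
}
```

The copy shares nothing with the original: nested objects such as Child are new instances too, which is what distinguishes this from a shallow MemberwiseClone().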

Finally, the last serialization technique that I discovered this week is the ResolveEventHandler, and it is really cool. Basically, what happens if you try to deserialize an object that was serialized by a different app, and you don't have the object's type defined anywhere in your assemblies? You simply set up a ResolveEventHandler (for example on AppDomain.CurrentDomain.AssemblyResolve), and during the deserialization process, if the type's assembly cannot be found among the app assemblies or any assemblies located in the current directory, your event handler will be called with the name of the assembly that needs to be located. You can then find the assembly (maybe by asking the user for a path), load it using Assembly.LoadFrom(), and return the loaded assembly from the handler. The deserialization process will then continue, and since the assembly has now been loaded, the next time an object of that type needs to be deserialized, your event handler should not need to be called again (although see the correction in the comments below). It's recommended that you register your event handler just before you attempt the possibly problematic deserialization, and then deregister it immediately afterwards.
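A rough sketch of the register, deserialize, deregister pattern follows. It assumes the handler is attached to AppDomain.AssemblyResolve and that the missing dll sits in a known folder; ResolvingDeserializer and AssemblyFolder are hypothetical names, and a real app might prompt the user for the path instead:

```csharp
using System;
using System.IO;
using System.Reflection;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical helper, not part of the framework.
public static class ResolvingDeserializer
{
    // Assumed location of the extra assemblies; stands in for asking the user.
    public static string AssemblyFolder = ".";

    public static object Deserialize(Stream stream)
    {
        ResolveEventHandler handler = new ResolveEventHandler(OnAssemblyResolve);

        // Register just before the risky deserialization...
        AppDomain.CurrentDomain.AssemblyResolve += handler;
        try
        {
            return new BinaryFormatter().Deserialize(stream);
        }
        finally
        {
            // ...and deregister immediately afterwards, even on failure.
            AppDomain.CurrentDomain.AssemblyResolve -= handler;
        }
    }

    private static Assembly OnAssemblyResolve(object sender, ResolveEventArgs args)
    {
        // args.Name is the assembly display name; keep only the simple name.
        string simpleName = new AssemblyName(args.Name).Name;
        string path = Path.Combine(AssemblyFolder, simpleName + ".dll");

        // Returning null tells the runtime this handler could not resolve it.
        return File.Exists(path) ? Assembly.LoadFrom(path) : null;
    }
}
```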

There is, of course, a lot of info on all this on the net, and on MSDN. But the reference I found most useful, with really practical (almost copy & paste) examples, was a set of 3 articles for MSDN Magazine. (Note that the section on SetObjectData() is out of date: the return value is no longer ignored, and the object that you've just set the data for *must* be returned.)

Now, go forth and serialize!

2 Comments:

At 5/7/05 09:57, Blogger CJ said...

Just a correction on one point I made about the ResolveEventHandler - if you load the dll using LoadFrom, it doesn't quite get loaded into your appdomain, as such - any attempts to access any of the types in the dll will trigger the ResolveEventHandler again.

 