NSPropertyListSerialization vs NSKeyedArchiver performance

Discussion:

Stefan Johansson

22 years ago

Hi all,

just "for fun" I did some serialization performance tests.
Currently I'm using NSKeyedArchiver coupled with NSKeyedUnarchiver to
serialize/deserialize dictionaries and send them over a socket.
Today I tried the same, but with NSPropertyListSerialization instead
(using the methods dataFromPropertyList and propertyListFromData).
I serialized/deserialized 10000 dictionaries with both methods and the
execution time difference was staggering.

NSKeyedArchiver took about 8 seconds to serialize 10000 dictionaries.
NSPropertyListSerialization took less than 1.5 seconds!

I don't know the underlying mechanisms for NSPropertyListSerialization, so
I can't figure out if this is correct or what.
If it is, I'm going to switch over to NSPropertyListSerialization
immediately.

What I do to send it over the socket is basically write([nsData bytes]...)
and then create an NSData object on the other side with the received data
and deserialize it.
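
A minimal sketch of that round trip, assuming ARC and the newer NSPropertyListSerialization methods (dataWithPropertyList:format:options:error: and propertyListWithData:options:format:error:) rather than the older ones named above. The helper names and the 4-byte length prefix are assumptions added for illustration: a stream socket does not preserve message boundaries, so the receiver needs some way to know where one serialized dictionary ends.

```objc
#import <Foundation/Foundation.h>
#include <arpa/inet.h>  // htonl / ntohl

// Hypothetical helper: serialize a property list to binary format and
// prepend a 4-byte big-endian length. Returns nil if serialization fails.
static NSData *FramedDataForPropertyList(id plist, NSError **error) {
    NSData *body =
        [NSPropertyListSerialization dataWithPropertyList:plist
                                                   format:NSPropertyListBinaryFormat_v1_0
                                                  options:0
                                                    error:error];
    if (body == nil) return nil;
    uint32_t length = htonl((uint32_t)body.length);
    NSMutableData *frame = [NSMutableData dataWithBytes:&length
                                                 length:sizeof length];
    [frame appendData:body];
    return frame;
}

// Hypothetical helper: the receiving side, given one complete frame.
// Returns nil if the frame is truncated or the payload is not a plist.
static id PropertyListFromFramedData(NSData *frame, NSError **error) {
    uint32_t length = 0;
    if (frame.length < sizeof length) return nil;
    [frame getBytes:&length length:sizeof length];
    length = ntohl(length);
    if (frame.length - sizeof length < length) return nil;
    NSData *body = [frame subdataWithRange:NSMakeRange(sizeof length, length)];
    return [NSPropertyListSerialization propertyListWithData:body
                                                     options:NSPropertyListImmutable
                                                      format:NULL
                                                       error:error];
}
```

The sender would pass frame.bytes and frame.length to write(); the receiver reads 4 bytes, then that many more, before deserializing.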

Any caveats or the like when using NSPropertyListSerialization instead of
NSKeyedArchiver/NSKeyedUnarchiver in this manner?

Thanks in advance.

Cheers,
Stefan

Chris Kane

22 years ago

That's not very surprising. NSKeyedArchiver is a general object graph
archiver. It has to support things like multiple references to the
same object, object replacement, calling each object's encodeWithCoder:
method, object graphs, and so on. This archiver also goes to extra
effort to make the output smaller than it would otherwise be (when
doing binary output) through some additional uniquing at the end of
archiving. That consumes some time, though I think compacting
dictionaries was explicitly skipped, as it was found to be too
expensive.

NSPropertyListSerialization only has to deal with the plist object
types, isn't a general archiver, only has to deal with object trees,
can know how to serialize each object type itself, and doesn't have to
preserve object reference patterns.

If you only have plists to send, by all means use the simplest
mechanism (NSPropertyListSerialization) for the job.
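
To illustrate the "object reference patterns" point, here is a small sketch, assuming ARC and the newer Foundation methods (macOS 10.13 or later for the keyed archiver calls). The function names are hypothetical. Both functions put the same mutable array into a root array twice and check whether the two slots are still one object after a round trip. The XML plist format and mutable containers are chosen deliberately so that the plist reader builds a fresh container for each element.

```objc
#import <Foundation/Foundation.h>

// Keyed archiving records each object once and refers back to it, so
// both slots should come back pointing at a single shared child.
static BOOL KeyedArchivePreservesSharing(void) {
    NSMutableArray *shared = [NSMutableArray arrayWithObject:@"x"];
    NSArray *root = @[shared, shared];
    NSData *data = [NSKeyedArchiver archivedDataWithRootObject:root
                                         requiringSecureCoding:NO
                                                         error:NULL];
    if (data == nil) return NO;
    NSSet *classes = [NSSet setWithObjects:[NSArray class],
                                           [NSMutableArray class],
                                           [NSString class], nil];
    NSArray *copy = [NSKeyedUnarchiver unarchivedObjectOfClasses:classes
                                                        fromData:data
                                                           error:NULL];
    return copy.count == 2 && copy[0] == copy[1];
}

// A property list is written as a tree: the shared child is simply
// written out twice and read back as two separate arrays.
static BOOL PropertyListPreservesSharing(void) {
    NSMutableArray *shared = [NSMutableArray arrayWithObject:@"x"];
    NSArray *root = @[shared, shared];
    NSData *data =
        [NSPropertyListSerialization dataWithPropertyList:root
                                                   format:NSPropertyListXMLFormat_v1_0
                                                  options:0
                                                    error:NULL];
    if (data == nil) return NO;
    NSArray *copy =
        [NSPropertyListSerialization propertyListWithData:data
                                                  options:NSPropertyListMutableContainers
                                                   format:NULL
                                                    error:NULL];
    return copy.count == 2 && copy[0] == copy[1];
}
```

If code on the receiving side relies on two keys referring to the same mutable object, that relationship is lost with the plist route and has to be rebuilt by hand.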

Chris Kane
Cocoa Frameworks, Apple

...

matt neuburg

22 years ago

Post by Chris Kane

Post by Stefan Johansson
NSKeyedArchiver took about 8 seconds to serialize 10000 dictionaries.
NSPropertyListSerialization took less than 1.5 seconds!

That's not very surprising.

No, but it's sure annoying. I recently switched my
document-based app's doc file format from keyed archiving to
a simple property list, because with keyed archiving it was
taking 30 seconds of spinning-cursor-of-death to save a
typical document. With a property list, it takes a fraction
of a second.

The advantage of using keyed archiving is that it's so easy
and elegant, and therefore tempting, to program. I simply
said this:

- (NSData *)dataRepresentationOfType:(NSString *)aType
{
    return [NSKeyedArchiver archivedDataWithRootObject: self->theData];
}

The class of theData knew how to encode its various pieces,
and each of those pieces knew how to encode itself, and
everything was fine. With a property list, on the other
hand, I have to jump through all sorts of hoops to force my
data into the kinds of formats that a property list knows
about (arrays and dictionaries and strings and suchlike).
And then I have to jump through the converse hoops to open a
file and turn it back into my data structures. However, the
user experience was simply unacceptable when using keyed
archiving. Explaining to my users that this approach was
elegant behind the scenes just didn't cut it, somehow. m.

PS Opening a keyed-archived file was very fast; it was just
saving that was so slow.

--------
matt neuburg, phd = ***@tidbits.com, http://www.tidbits.com/matt/
pantes anthropoi tou eidenai oregontai phusei
Subscribe to TidBITS! It's free and smart. http://www.tidbits.com/

j o a r

22 years ago

...but it's not really fair to compare the two - apples and oranges and
all that.
Have you compared the new keyed archiver to the old archiver
(NSArchiver / NSUnarchiver)? How do they stack up?

j o a r

...

Chris Kane

22 years ago

Post by matt neuburg
No, but it's sure annoying. I recently switched my
document-based app's doc file format from keyed archiving to
a simple property list, because with keyed archiving it was
taking 30 seconds of spinning-cursor-of-death to save a
typical document. With a property list, it takes a fraction
of a second.

(1) can you boil things down to a stand-alone test case that can be
sent to Apple? (The keyed archiver variation.) We'd be interested in
a case like this with such an extreme problem.

(2) or alternatively, have you sampled and/or run other performance
tools on the document-saving aspect of your application? Where is the
time spent? Is lots of autorelease or general allocation activity
going on?

NSKeyedArchiver isn't self-contained: it calls out to the individual
objects to do their encoding, and if they are slow, the whole process
will be slow. If the slow class is one of the Cocoa classes, only
Apple can potentially fix that.

Speed will also depend on the number of objects being encoded. ...
which depends on how your data is structured, too, of course.
encodeBool:forKey: is much faster and produces a smaller result in the
output file than encodeObject:forKey: with an NSNumber object
containing a boolean.
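
A rough sketch of the two styles, assuming ARC and the newer secure-coding methods (macOS 10.13 or later). The Flag class and the RoundTripRead function are hypothetical names made up for this example. The scalar call is the active one; the object-based alternative is shown in a comment for comparison.

```objc
#import <Foundation/Foundation.h>

// Hypothetical model class with a single boolean field.
@interface Flag : NSObject <NSSecureCoding>
@property (nonatomic) BOOL read;
@end

@implementation Flag

+ (BOOL)supportsSecureCoding { return YES; }

- (void)encodeWithCoder:(NSCoder *)coder {
    // Scalar path: the boolean is stored directly under the key.
    [coder encodeBool:self.read forKey:@"read"];
    // Object path, for comparison: wraps the value in an NSNumber that
    // the archiver then has to treat as one more object in the graph.
    // [coder encodeObject:@(self.read) forKey:@"read"];
}

- (instancetype)initWithCoder:(NSCoder *)coder {
    if ((self = [super init])) {
        _read = [coder decodeBoolForKey:@"read"];
    }
    return self;
}

@end

// Archives a Flag holding `value` and returns the decoded boolean.
static BOOL RoundTripRead(BOOL value) {
    Flag *flag = [Flag new];
    flag.read = value;
    NSData *data = [NSKeyedArchiver archivedDataWithRootObject:flag
                                         requiringSecureCoding:YES
                                                         error:NULL];
    Flag *copy = [NSKeyedUnarchiver unarchivedObjectOfClass:[Flag class]
                                                   fromData:data
                                                      error:NULL];
    return copy.read;
}
```

The decoding side has to match the encoding side: a value written with encodeBool:forKey: is read with decodeBoolForKey:, while the NSNumber variant would be read with an object-decoding call.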

Post by matt neuburg
PS Opening a keyed-archived file was very fast; it was just
saving that was so slow.

The archiving/unarchiving process is optimized for the reading case, at
the expense of the creating case, in both the NSArchiver and
NSKeyedArchiver cases. This is because reading is likely to happen
more times than writing. Not in the case of the user pumping cmd-s to
save a document, of course, but in the case of .nib files they get read
many orders of magnitude more times than they're written.

Chris Kane
Cocoa Frameworks, Apple

matt neuburg

22 years ago

Post by Chris Kane

(1) can you boil things down to a stand-alone test case that can be
sent to Apple? (The keyed archiver variation.) We'd be interested in
a case like this with such an extreme problem.

...

I certainly could - all I have to do is send you my app and one of its
documents. The reason this is so easy is that just as I was about to change
from keyed archiving to plists, some guy at Apple :) posted a note
suggesting that any such change be implemented through addition of a
document type, which seemed a great idea to me. Thus, to see the problem,
you just do a Save As and save the document in the desired format.

Post by Chris Kane
(2) or alternatively, have you sampled and/or run other performance
tools on the document-saving aspect of your application? Where is the
time spent? Is lots of autorelease or general allocation activity
going on?

I never quite figured this out, mostly because everything happens behind
the scenes where I can't get at it easily. All my code does, as I mentioned
before, is say

return [NSKeyedArchiver archivedDataWithRootObject: parsedData];

This parsedData is of a class called Messages, which naturally implements
encodeWithCoder:

- (void)encodeWithCoder:(NSCoder *)coder {
    [coder encodeObject: theMessages forKey: @"theMessages"];
    [coder encodeObject: theReplyAddress forKey: @"theReplyAddress"];
}

Of these two objects, only theMessages is interesting; it's an NSArray of
NSDictionaries of NSArrays of a class called Message, which also implements
encodeWithCoder:

- (void)encodeWithCoder:(NSCoder *)coder {
    [coder encodeObject: sender forKey: @"sender"];
    [coder encodeObject: subject forKey: @"subject"];
    [coder encodeObject: text forKey: @"text"];
    [coder encodeBool: read forKey: @"read"];
    [coder encodeBool: keep forKey: @"keep"];
    [coder encodeObject: trueDate forKey: @"trueDate"];
    [coder encodeObject: rawDate forKey: @"rawDate"];
}

That's all the code there is, which is precisely why I liked the idea of
using keyed archiving; it's so simple, elegant and encapsulated.
Unfortunately, as I said, there turned out to be another price, speed, and
in the end I had to make a trade-off in favor of faster saving.

Post by Chris Kane
NSKeyedArchiver isn't self-contained, it calls out to the individual
objects to do their encoding, and if they are slow, the whole process
will be slow, for example. If it's one of the Cocoa classes for
example, only Apple can potentially fix that.
Speed will also depend on the number of objects being encoded. ...
which depends on how your data is structured, too, of course.
encodeBool:forKey: is much faster and produces a smaller result in the
output file than encodeObject:forKey: with an NSNumber object
containing a boolean.

...

Well, I tried putting in a notification so that I could display the
progress as each Message was archived, and also I thought that this would
make the user feel better about the wait. But it didn't help. It turned out
that the processing of each Message was quite fast; the progressbar shot
from empty to full quite speedily. But *afterwards* - that is to say,
*after* every Message was processed - there was still a lengthy delay
before the File menu unhighlighted and the close button changed from dirty
to non-dirty, and since there was no way I could account for this
post-processing delay, which seems to be built into the inner workings of
keyed archiving itself, I ended up casting about for other ways to save
the data.
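
For comparison, the property-list route for a class like Message might look roughly like the sketch below. This is not the actual code from the app: the class is a cut-down stand-in, the field types (NSString, NSDate) are assumed, the method and function names are hypothetical, and ARC plus the newer NSPropertyListSerialization methods are assumed as well.

```objc
#import <Foundation/Foundation.h>

// Hypothetical stand-in for Message; field types are assumptions.
@interface Message : NSObject
@property (nonatomic, copy) NSString *sender;
@property (nonatomic, copy) NSString *subject;
@property (nonatomic) BOOL read;
@property (nonatomic, strong) NSDate *trueDate;
- (NSDictionary *)propertyListRepresentation;
- (instancetype)initWithPropertyList:(NSDictionary *)plist;
@end

@implementation Message

// Flatten the object into plist types. Assumes no field is nil;
// a nil value in a dictionary literal raises an exception.
- (NSDictionary *)propertyListRepresentation {
    return @{ @"sender":   self.sender,
              @"subject":  self.subject,
              @"read":     @(self.read),
              @"trueDate": self.trueDate };
}

// The converse step: rebuild the object from the dictionary.
- (instancetype)initWithPropertyList:(NSDictionary *)plist {
    if ((self = [super init])) {
        _sender   = [plist[@"sender"] copy];
        _subject  = [plist[@"subject"] copy];
        _read     = [plist[@"read"] boolValue];
        _trueDate = plist[@"trueDate"];
    }
    return self;
}

@end

// Serialize an array of messages as a binary property list.
static NSData *DataForMessages(NSArray *messages, NSError **error) {
    NSMutableArray *plist = [NSMutableArray arrayWithCapacity:messages.count];
    for (Message *m in messages) {
        [plist addObject:[m propertyListRepresentation]];
    }
    return [NSPropertyListSerialization dataWithPropertyList:plist
                                                      format:NSPropertyListBinaryFormat_v1_0
                                                     options:0
                                                       error:error];
}

// Read the array back; returns nil if the data is not an array plist.
static NSArray *MessagesFromData(NSData *data, NSError **error) {
    id plist = [NSPropertyListSerialization propertyListWithData:data
                                                         options:NSPropertyListImmutable
                                                          format:NULL
                                                           error:error];
    if (![plist isKindOfClass:[NSArray class]]) return nil;
    NSMutableArray *messages = [NSMutableArray array];
    for (NSDictionary *d in plist) {
        [messages addObject:[[Message alloc] initWithPropertyList:d]];
    }
    return messages;
}
```

Every field needs a line in both directions, which is the extra work compared with the one-line keyed archiver call.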

m.
