Evernote Tech Blog

The Care and Feeding of Elephants

Fast String Handling: A Frayed Knot

Evernote’s servers process a lot of data for our users. At any given time, a shard may be performing different activities for different clients. For example:

  • Constructing dynamic web pages for user accounts
  • Performing API calls on notebooks, tags, etc.
  • Uploading or downloading images
  • Uploading or downloading other files (audio, PDF, etc.)
  • Clipping web pages from remote sites
  • Managing image/PDF text recognition
  • Indexing notes (including search data and PDF contents) into Lucene indices
  • Performing searches against Lucene
  • Rendering “thumbnail” images for notes, images, PDFs
  • Handling new or recurring payment processing for PayPal, Google Checkout, CyberSource, iTunes

This heterogeneous mix of tasks and data means that our application can be particularly sensitive to concurrency bottlenecks in both our code and third-party libraries.  While Evernote activity isn’t particularly “bursty” compared to some web services, the daily variation across our 95 shards means that even infrequent chokepoints will hit some shards from time to time.

When our monitoring systems detect that a particular shard is underperforming, we try to capture as much information as possible about the current state of the server without introducing more problems. One low-tech tool is “sudo killall -3 java”, which dumps the current stack trace of every Java thread to the JVM’s standard output. We can then inspect the state of each thread for signs of problems. Here’s a fun example of the sort of bottleneck we find by inspecting enough stack dumps:

On regular occasions, we’d find a number of threads on a choking server all waiting to convert a byte[] to a String (or vice versa) using a named encoding. We’d find blocked threads originating in code from Tomcat, MySQL Connector/J, GWT, SAX, Thrift, and the JRE itself. The threads would all look something like this:

java.lang.Thread.State: BLOCKED (on object monitor)
       at sun.nio.cs.FastCharsetProvider.charsetForName(Unknown Source)
       - waiting to lock <0x00007f4c3d48acb0> (a sun.nio.cs.StandardCharsets)
       at java.nio.charset.Charset.lookup2(Unknown Source)
       at java.nio.charset.Charset.lookup(Unknown Source)
       at java.nio.charset.Charset.isSupported(Unknown Source)
       at java.lang.StringCoding.lookupCharset(Unknown Source)
       at java.lang.StringCoding.encode(Unknown Source)
       at java.lang.String.getBytes(Unknown Source)
       at com.mysql.jdbc.StringUtils.getBytes(StringUtils.java:499)
       ...

After reading the JRE code, we found that the concurrency bottleneck is caused by a simple synchronization block in the [ironically-named?] FastCharsetProvider.charsetForName method, which looks up a cached Charset for a String name (like “UTF-8”). The use of Java’s ‘synchronized’ keyword to protect this in-memory cache prevents two threads from corrupting the cache data structures, but it means only one thread can consult the cache at a time.
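
To make the shape of the problem concrete, here is a minimal, illustrative sketch of a name-to-Charset cache guarded by a single monitor. This is not the actual sun.nio.cs source, and the class name is ours; it just shows why every conversion that passes a charset name has to queue up behind the same lock:

import java.nio.charset.Charset;
import java.util.HashMap;
import java.util.Map;

class NamedCharsetCache {
    private final Map<String, Charset> cache = new HashMap<String, Charset>();

    // One global monitor around the cache: even a cache hit for "UTF-8"
    // waits for every other thread doing the same lookup.
    synchronized Charset charsetForName(String name) {
        Charset cs = cache.get(name);
        if (cs == null) {
            cs = Charset.forName(name); // slow path: resolve and remember
            cache.put(name, cs);
        }
        return cs;
    }
}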

There’s at least one Java RFE filed to improve this bottleneck.  As suggested by Paul Linder, the modern ConcurrentHashMap collection provides a better alternative to fully synchronized classic Maps for caching.
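
For comparison, here is a hedged sketch of what that alternative could look like: the same name-to-Charset cache, but backed by a ConcurrentHashMap so readers never block each other. Again, this is illustrative code of our own naming, not the proposed JRE patch:

import java.nio.charset.Charset;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class ConcurrentCharsetCache {
    private final ConcurrentMap<String, Charset> cache =
            new ConcurrentHashMap<String, Charset>();

    Charset charsetForName(String name) {
        Charset cs = cache.get(name);                  // lock-free read on the hot path
        if (cs == null) {
            Charset resolved = Charset.forName(name);  // rare miss: resolve it
            Charset prev = cache.putIfAbsent(name, resolved);
            cs = (prev != null) ? prev : resolved;     // keep whichever thread won the race
        }
        return cs;
    }
}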

But we don’t really have the luxury of waiting for a full JRE fix, so we have to reduce the impact of this bottleneck ourselves via things like:

  • Patch Tomcat
  • Patch the GWT parser
  • Suggest fixes for MySQL Connector/J
  • Replace all relevant byte[]<->String transformations across our own codebase, as sketched after this list. (Including such unpleasantness as removing every use of the JRE’s URLEncoder/URLDecoder classes, whose internal String encodings we can’t patch.)
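
For our own code, one way to do that last replacement is to resolve each Charset once and pass the Charset object, rather than its name, into the conversion, since the Charset overloads of getBytes and the String constructor (available since Java 6) never consult the synchronized name lookup. A minimal sketch, using a hypothetical helper class of our own naming:

import java.nio.charset.Charset;

// Hypothetical helper: resolve each charset once, at class-load time.
public final class Charsets {
    public static final Charset UTF_8 = Charset.forName("UTF-8");

    private Charsets() {}
}

// Before: every call pays for the synchronized name lookup.
//   byte[] data   = text.getBytes("UTF-8");
//   String parsed = new String(data, "UTF-8");
//
// After: the Charset overloads skip the named lookup entirely.
//   byte[] data   = text.getBytes(Charsets.UTF_8);
//   String parsed = new String(data, Charsets.UTF_8);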

Short version:  Large-scale concurrency is kind of hard.  Java’s ConcurrentHashMap is super awesome for in-memory caching.

One Comment

  1. We are experiencing the same server issues as you guys with FastCharsetProvider. For us, it is mainly from Cassandra’s client library Hector/Thrift, and the MySQL Connector. I am surprised I don’t see more uproar about this.

    I think that patching these libraries might be a bit of a red herring though, in the sense that possibly the problem is way too much contention. The solution would be more servers and fewer threads, even if each request takes longer to process. It is so easy to let ExecutorServices get out of control and assume that more threads == better.

    For us I have a feeling that removing the charset contention is going to point to massive spikes in the number of threads we are using.

