The yellow covers of the “For Dummies” series are hard to miss. The series, published by Wiley, has staked its claim in the how-to genre by offering accessible, digestible advice on everything from raising a hermit crab to origami to drones. With over 200 million books in print and more than 1,600 titles, the books have satisfied quite a few readers’ information appetites. In a way, the “For Dummies” series has become a physical manifestation of the Internet itself.
Curiosity Creating Content
Hear me out. The impulse that created the “big data” reality of the Internet is pretty similar to the driving force behind the constantly expanding series: our need to know more about everything. Just as the “Dummies” books serve up a buffet of diverse and digestible content, the Internet satisfies our desire for knowledge by finding and delivering the exact content we’re looking for within seconds (or minutes, depending on your Internet provider). Keeping up with our curiosity has required larger and larger datasets for the Internet and for the “Dummies” publisher, Wiley. Wiley has spent most of its more than 200-year existence (nope, that’s not a typo) accruing information to the tune of 6 million articles from 2,000 journals, 13,000 books, and 200,000 other reference articles, all of which are supported by 6 million bibliographic references.
Yeah, that’s a lot of data. But Wiley Publishing is no dummy (c’mon, you knew that one was coming). Historically, Wiley gathered content from various systems, reframed it to fit the “For Dummies” formula, and packaged the information in physical books. The emergence of the Internet spurred a move in the digital direction, but making that digital leap took some serious data wizardry.
Dummies Goes Digital
In 2006, Wiley brought in Freddie Quek as their Director of Strategic Initiatives. “With ‘big data,’ comes lots of content,” Quek explains. “You need that content not only to be accessible but also useable.” Quek knew that turning Wiley’s massive data pool into usable, accessible content would require a secret weapon: MarkLogic. The pros at MarkLogic decided that handling Wiley’s data would require a unique solution. Quek was confident in their ability to design just that: “With MarkLogic, you can do anything,” he says.
Using custom technology, Wiley now operates an expansive online information resource. The Wiley Online Library provides access to more than 18 million documents, including 6 million articles from 2,000 journals, 13,000 books, and 200,000 other reference articles, all supported by 6 million bibliographic references. Now, 60 percent of Wiley’s revenue comes from its digital ventures. The Online Library, with over 18 million documents and 15,000 books, now gets 65 million page views per month.
Two Data Pools: One Online Library
But the true test of MarkLogic’s data management abilities came in 2012 when Wiley partnered with the American Geophysical Union to integrate AGU’s entire body of information (160,000 articles from nearly 800 data sources) into the Wiley Online Library. The biggest challenge they faced was timing. In order to create revenue, all this information needed to be fed into the Wiley Online Library, and they had four months to do it. But Quek was confident. “You can throw any content at MarkLogic, and you get unified access.”
MarkLogic had clearly proven itself to be a worthy partner. Now, the team was able to repurpose the same XML store and content modeling used to build the Wiley Online Library and apply them to the AGU data pool. Armed with this tested approach, developers were able to package and deliver content on all sorts of devices, in any desired format. The XML format also allowed Wiley to add new markup to support complicated queries, transfer content, and accommodate new content as it arrived.
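To make the idea of adding new markup a little more concrete, here is a minimal sketch in Python. It is an illustration only: it uses Python's standard-library XML parser rather than MarkLogic, and the element names (article, subject), the sample record, and the helper functions are all hypothetical, not drawn from Wiley's actual content model.

```python
import xml.etree.ElementTree as ET

# Hypothetical article record; the element names and values are invented
# for illustration and do not reflect Wiley's or AGU's real schema.
ARTICLE_XML = """
<article id="a1">
  <title>Example Article Title</title>
  <journal>Example Journal</journal>
  <year>2012</year>
</article>
"""


def add_subject(article, subject):
    """Enrich a record with a new <subject> element, leaving existing fields alone."""
    node = ET.SubElement(article, "subject")
    node.text = subject
    return article


def find_by_subject(articles, subject):
    """Return the ids of articles that carry the given subject tag."""
    return [
        a.get("id")
        for a in articles
        if any(s.text == subject for s in a.findall("subject"))
    ]


article = ET.fromstring(ARTICLE_XML)
add_subject(article, "oceanography")
print(find_by_subject([article], "oceanography"))  # ['a1']
```

The point of the sketch is that the new element sits alongside the existing ones, so older content keeps working while newer queries can take advantage of the extra tag.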
Quick and Not So Dirty
They had the tools, but meeting the four-month deadline would require rapid development. MarkLogic was ready with troubleshooting, auditing, and reporting capabilities, all of which helped Wiley meet project milestones, recognize revenue, improve content quality, and identify bottlenecks. As if that weren’t enough, they also added built-in search with native indexing to improve content load time.
The speed and efficacy of the program delivered AGU’s content to more than 60,000 users, and earned the Wiley team a slew of awards, including the President Award for Excellence, IT Project Team of the Year award at the UK IT Industry Awards, and MarkLogic’s Customer Excellence Award.
Thanks to MarkLogic, Wiley can continue to work with partners like AGU to provide an immense treasure trove of knowledge to curious non-dummies everywhere.
Jessica Ferri is a writer based in Brooklyn. You can find her at jessicaferri.com.