https://s3-us-west-2.amazonaws.com/secure.notion-static.com/6e8aef81-9fd8-4e4d-842f-802c2104cf9a/Word_for_scientific_publishing_-_Microsoft_Conversations.mp3
Pablo Fernicola is a group manager at Microsoft. He runs a project focused on delivering tools and services for scientific and technical publishing, with a particular interest in the transition from print to electronic and web-based content, and its implications for collaboration, search, and content discovery in the future.
In this interview, Pablo explains how a new add-in for Word, now available as a technical preview, helps authors and publishers of scientific articles work more effectively with one another, and with online archives like PubMed Central.
JU: Hi Pablo, thanks for joining us to talk about a new Word add-in for authors of scientific journal articles. It's an interesting story about applying the XML capabilities of Office, and also about the evolution of journal publishing. How did this project get started?
PF: It's an incubation project. Three people had an idea: Jean Paoli, an XML pioneer, Jim Gray...
JU: Oh really? I didn't know he had been involved.
PF: Yes, he and Jean really pushed to get this started, and they both recruited me for this project. It's been a little over a year since Jim disappeared, and that was a big blow, considering his key role.
And the third key person is Tony Hey.
JU: We should explain that Tony runs what's called the technical computing initiative, and is very involved in figuring out how Microsoft can help various people in the scientific community address computing and information management challenges.
PF: Right. Scientific authors in many disciplines use Word to write articles. We looked into how to simplify the workflow, streamline the process, and lower the cost. And not just for the authors, but also for the journal publishers.
JU: It's been true for a long time in publishing, and not just scientific publishing, that there have been real challenges getting that Word content converted into the kinds of long-term formats we need: XML that's richly decorated with metadata.
Publishers have tended to give people templates that use styles to control what's in the document. But since Word 2003, and especially since Word 2007, there has been a set of XML capabilities that make a much more robust approach possible.
PF: That's right. Before Word 2003, styles were the best you could do. And people got quite far by relying on them. But they were very fragile. When you copied and pasted, styles would bleed across. It was hard to disentangle that when you converted the file.
JU: That's part of the problem. And part of it is that, along with the content itself, there's a process involving the metadata, and that process is divided between the author and the journal publisher. It's a shared responsibility, and you need an information management system that embraces that division of labor.
PF: Also: What kind of user interface do you present to these different groups? There are really three groups. First the authors, who are subject-matter experts but don't know anything about the publishing process, and shouldn't have to know.
Second, the journal editors. They're also subject-matter experts, but they also know about the structure of the journal, and about the metadata they need to apply.
And third, you have companies and vendors who do backend tools and services, as well as the folks who work on the electronic archives. With the move from print to electronic journals, the role of the archive becomes very significant. Either the journals have their own repositories, or you have centralized repositories at university libraries or larger institutions, for example the National Library of Medicine with PubMed Central, or Cornell with Arxiv.org.
That group is very technical in terms of understanding file formats, elements, and properties.
JU: But even those folks shouldn't necessarily need to master all of that. They'd rather spend their time on math and physics, not the minutiae of XML publishing.