Bad Arolsen Media Analysis
Edwin Black January 21, 2008
When probing the Holocaust, the horrific experiences of survivors, the listener melts. We all melt at the enormity of the horror. Tattoos always trump the arcane questions of technology. But professionals who study the Holocaust beyond the blood and bones of mass murder know information technology was an indispensable behind-the-scenes factor in the original crime. Seventy-five years after Adolf Hitler came to power, information technology is again an indispensable behind-the-scenes factor, this time in exposing the crime.
This brings the Holocaust community to the continuing controversy over providing survivors remote secure access terminals to the Bad Arolsen archives instead of making them travel to the United States Holocaust Memorial Museum in Washington, D.C. to obtain details of their incarceration and enslavement. The USHMM is refusing to share access with other Holocaust institutions. It now claims it will begin “individualized research” for the estimated 150,000 survivors in America and perhaps many among a million worldwide, all of whom want answers today, not tomorrow, and that it will do so with an initial staff of 24 trained researchers. The Museum refuses to budge and is increasingly defensive about the persistent demands of Holocaust survivors and media inquiries about a seemingly obvious question.
During and after the January 17, 2008 USHMM press conference on the topic of the Bad Arolsen archival transfers, Museum executive director Sara Bloomfield made statements about the archival technology to the Jewish Telegraphic Agency (JTA), the 85-year-old, war-tested Jewish communal news service, known for its precision as well as its diligent, respected correspondents. Bloomfield’s remarks to the JTA reporter about why the Bad Arolsen files could not be shared, it seems, amounted to a calculated misinformation effort to pretend such sharing was impossible. The opposite is true. Her remarks were implausible on their face, and completely contrary to the published facts.
The JTA report stated: Much of the material delivered to the museums on hard drives packed into suitcases is not yet digitally searchable; images of the documents and 50 million index cards that arrived between August and November of last year are in jpeg form. Converting those images to searchable files will take much time and millions of dollars, officials of the U.S. Holocaust museum said at a news conference last Thursday morning, before the meeting with survivor groups. "To make it machine-readable would take millions and millions," said Sara Bloomfield, the museum's director. "We don't have the time." Instead, said Michael Haley Goldman, the director of the museum registry, the priority would be to answer survivor questions, with trained staffers searching through the material.
That raises the obvious question: if the files are not “searchable,” what will the trained staffers search? What has the staff at Bad Arolsen been searching for years? Answer: they, of course, won’t search jpegs of documents, because jpegs (Joint Photographic Experts Group files) are mere picture images of documents that are not easily translated into raw text. Instead they will search the databases common to virtually all image management systems used by banks, historical archives and government repositories. All of Bad Arolsen’s jpegs are in fact indexed in some way in a relational database.
By typing key words such as name, birth city and birthdate, the operator prompts the database to provide candidate images of documents. Trial and error will narrow the jpeg documents, eliminating those of similar name or circumstance until the right person is matched. This elimination can take moments or perhaps hours, depending upon the recollection and details punched into the database.
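The kind of lookup described above can be sketched as a query against a small relational index. This is only an illustration under stated assumptions: the table name `image_index`, its columns, the image file names and the birth cities are all invented for the example and do not describe Bad Arolsen's actual schema. The names and birth dates are borrowed from the search example in the Bad Arolsen summary.

```python
import sqlite3

# Hypothetical schema: one row per indexed document image.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE image_index ("
    " surname TEXT, given_name TEXT, birth_date TEXT,"
    " birth_city TEXT, image_file TEXT)"
)
# Sample rows; cities and file names are invented for illustration.
conn.executemany(
    "INSERT INTO image_index VALUES (?, ?, ?, ?, ?)",
    [
        ("Rosenbaum", "Edgar", "1898-03-03", "Berlin", "img_0001.jpg"),
        ("Rosenbaum", "Eduard", "1911-01-02", "Wien", "img_0002.jpg"),
        ("Rosenbaum", "Edwin", "1912-03-12", "Berlin", "img_0003.jpg"),
    ],
)

def find_candidates(surname, given_name=None, birth_date=None, birth_city=None):
    """Return image files matching the surname plus any optional details given."""
    sql = "SELECT image_file FROM image_index WHERE surname = ?"
    params = [surname]
    optional = (
        ("given_name", given_name),
        ("birth_date", birth_date),
        ("birth_city", birth_city),
    )
    # Column names come from the fixed tuple above, never from user input.
    for column, value in optional:
        if value is not None:
            sql += f" AND {column} = ?"
            params.append(value)
    sql += " ORDER BY image_file"
    return [row[0] for row in conn.execute(sql, params)]

# A surname alone returns every candidate; each added detail narrows the list.
broad = find_candidates("Rosenbaum")
narrow = find_candidates("Rosenbaum", birth_date="1912-03-12", birth_city="Berlin")
```

The point of the sketch is that the images themselves are never read as text: the search runs entirely against the index rows, and each additional detail the survivor recalls shortens the list of candidate images.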
The exact details of the database systems were published as a Cutting Edge News exclusive report in August 2007, after an exhaustive month-long international effort that included obtaining written descriptions from the technology chief at Bad Arolsen about his own systems and consulting numerous computer resource experts. An August 2007 Bad Arolsen written summary of its two data systems, obtained by Cutting Edge News, follows:
CNI (Central Name Index)
The ITS Central Name index was originally a paper index file, sorted in an alphabetical-phonetical way that matches the requirements of the ITS work. It was built (in paper) from the early 1950´s to 1998 and contains in its physical form 42 million cards related to 17 million identities. Between 1998 and 2000 the complete CNI was digitized. To maintain the operational status of the Organisation during the 2 year scanning-work and the following digitisation process, the CNI was digitized in the same order than the paper file. An additional 8 million pieces of information and 0.5 million identities were added to the digital CNI since the year 2000. Due to the absence of a standardized format and because most of the original cards were hand-or typewriter written, only 8-10% of the images could be OCR´ed [Optical Character Read].
To search the CNI database, the operator types a name, a surname and a date of birth into the GUI-front end [Graphical User Interface]. He will be led to the first possible match for the name, surname and date of birth. From this point he can leaf manually through the images that lie physically “behind” the first image.
Search: Rosenbaum, Edwin, 12.03.1912
Result: Rosenbaum, Edgar, 03.03.1898 (Image w/metadata)
(manually) Leaf one forward: Rosenbaum, Edgar, 08.12.1911 (only image)
(manually) Leaf one forward: Rosenbaum, Eduard, 01.10.1899 (only image)
(manually) Leaf one forward: Rosenbaum, Eduard, 02.01.1911 (only image)
(manually) Leaf one forward: Rosenbaum, Edwin, 12.03.1912 (only image)
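The CNI search in this example, a jump to the first possible match followed by manual leafing, can be sketched roughly as follows. The sketch assumes that only cards carrying metadata can be reached by typing, and it substitutes plain alphabetical ordering for the ITS alphabetical-phonetical rules, which the summary does not spell out. The function names are hypothetical; the five cards are the ones listed in the example.

```python
from bisect import bisect_right

# Cards in physical (paper-file) order:
# (surname, given name, birth date as YYYY-MM-DD, has_metadata).
CARDS = [
    ("Rosenbaum", "Edgar", "1898-03-03", True),
    ("Rosenbaum", "Edgar", "1911-12-08", False),
    ("Rosenbaum", "Eduard", "1899-10-01", False),
    ("Rosenbaum", "Eduard", "1911-01-02", False),
    ("Rosenbaum", "Edwin", "1912-03-12", False),
]

# Assumption: only metadata-bearing cards are searchable by key.
INDEXED = [(card[:3], pos) for pos, card in enumerate(CARDS) if card[3]]
INDEXED_KEYS = [key for key, _ in INDEXED]

def first_possible_match(query):
    """Position of the last metadata-bearing card whose key is <= the query."""
    i = bisect_right(INDEXED_KEYS, query)
    return INDEXED[max(i - 1, 0)][1]

def leaf_forward(query):
    """Jump to the first possible match, then step forward card by card."""
    pos = first_possible_match(query)
    steps = 0
    while pos < len(CARDS) and CARDS[pos][:3] != query:
        pos += 1
        steps += 1
    if pos == len(CARDS):
        return None, steps  # ran off the end without finding the card
    return pos, steps

position, steps = leaf_forward(("Rosenbaum", "Edwin", "1912-03-12"))
```

With these five cards the search lands on the Edgar Rosenbaum card from 1898 and reaches Edwin Rosenbaum after four manual steps, mirroring the four "Leaf one forward" lines in the example.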
SIMS (Simple Image Management System)
The SIMS Database was developed after 2000 to digitize the physical documents located at the archives. For this, a different approach was chosen. The ITS search scanned original documents by name, surname and date of birth too, but they are ordered according to archival units and without the alphabetical-phonetical system.
The Incarceration Collection of the archives has been digitised and stands ready for export by the end of the month of August 2007. The majority of the archival units contained therein, like “CC Buchenwald Men”, “CC Dachau” etc. are fully indexed. Some other units (that are rarely used by the ITS) are 4% indexed (every 25th image). Lists are, at this point, not name-indexed.
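The "4% indexed (every 25th image)" figure implies a bounded amount of manual leafing in the rarely used units. A minimal sketch of that arithmetic follows, assuming images are numbered from 0 and that images 0, 25, 50 and so on carry the index entries; the summary does not state the numbering, and the function names are hypothetical.

```python
INDEX_INTERVAL = 25  # every 25th image is indexed, per the summary

def indexed_fraction(interval=INDEX_INTERVAL):
    """Share of images that carry an index entry."""
    return 1 / interval

def nearest_indexed_before(image_number, interval=INDEX_INTERVAL):
    """Largest indexed image number that is <= image_number."""
    return (image_number // interval) * interval

def leafing_steps(image_number, interval=INDEX_INTERVAL):
    """Manual steps forward from the nearest indexed image to the target."""
    return image_number - nearest_indexed_before(image_number, interval)

# Worst case over a sample unit of 1,000 images.
worst_case = max(leafing_steps(n) for n in range(1000))
```

Under these assumptions an operator who lands on the nearest indexed image never has to leaf past more than 24 unindexed images to reach any given document.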
Thus, while the CNI database is mainly an index file for tracing names and their possible different spellings, the SIMS-System gives access to the digitised documents containing relevant information about the individual.
When Bloomfield spoke to the JTA reporter, she knew that the document images were searchable not as “machine-readable jpegs” but as image components in a vast database. The database is now in XML, a universally readable, Internet-ready format. It can be transferred anywhere on a hard drive. More than that, Bloomfield knew when speaking to the JTA that Bad Arolsen data searches could be done from any authorized computer terminal anywhere in the world. In May 2007, an ITS letter on the subject explained that “Option 3” for data transfer was no transfer at all, but merely simple and immediate remote access to its own databases, which could be completed within about three to four months rather than years. The Bad Arolsen statement obtained by The Cutting Edge News states:
IC/ITS choice for Database Transfer
In May 2007 the IC/ITS (International Commission for the International Tracing Service) met in Amsterdam to decide on the method of giving a copy to each member state desiring to receive one.
Three options were mentioned:
- Option 1: Complete replication of Hard- and Software and the database for each member state.
- Option 2: Export of the data in an standardized data format (XML), so each member state can easily import the data in their own data system.
- Option 3: Access of the member states to the ITS Database via VPN or a similar technique.
The IC/ITS opted for option 2 and the ITS is now implementing this option.
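What Option 2 means in practice, exporting records to XML so that any receiving institution can import them into its own system, can be illustrated with a short round trip. The element names (`record`, `surname`, `given_name`, `birth_date`, `image_file`) and the sample values are hypothetical; the actual ITS export format is not described in the statement.

```python
import xml.etree.ElementTree as ET

def record_to_xml(record):
    """Serialize one index record as an XML element (hypothetical element names)."""
    elem = ET.Element("record")
    for field, value in record.items():
        ET.SubElement(elem, field).text = value
    return elem

def xml_to_record(elem):
    """Rebuild the record dictionary on the receiving side."""
    return {child.tag: child.text for child in elem}

# Illustrative record; the image file name is invented.
sample = {
    "surname": "Rosenbaum",
    "given_name": "Edwin",
    "birth_date": "1912-03-12",
    "image_file": "img_0003.jpg",
}

# Export on the sending side, import on the receiving side.
xml_text = ET.tostring(record_to_xml(sample), encoding="unicode")
restored = xml_to_record(ET.fromstring(xml_text))
```

Because the exported text is self-describing, the receiving system needs only a standard XML parser, not a copy of the sender's hardware or software, which is the distinction between Option 1 and Option 2.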
Just as 24 USHMM staffers sitting in the Washington Museum can access those databases, trying to communicate with elderly victims across America via phone, fax and letter in a fashion bound to continue the legendary backlogs, so could a USHMM staffer or indeed another institution’s staffer sitting in New York, Florida or California. For example, a terminal could be set up at the Center for Jewish History in New York, the Jewish Division of the New York Public Library, the Center for Holocaust and Human Rights Education at Florida Atlantic University in Boca Raton, the spacious Sherman Library at Nova Southeastern University in Ft. Lauderdale, the Greater Palm Beach Jewish Federation or the American Jewish University in Los Angeles. If only one of those 24 USHMM staffers could be stationed in Brooklyn, Miami, Los Angeles, Detroit or the other locales where survivors are congregated, months could be reduced to minutes for every search.
Asked why the International Tracing Service and the USHMM chose not to place all Bad Arolsen files on the Internet, a Red Cross official with direct access to the decision-making process replied, “Don’t ask me. Technically it will be feasible to access these databases from anywhere in the world. We would just export to XML format. We could then support a virtually unlimited number of remote terminals. Member countries would not receive copies, just access. This option was not taken.”
The original August 21, 2007 article detailing the Bad Arolsen database technology is reprinted below.
Cutting Edge Exclusive
Twice Spurned--French and Red Cross Suggested It
Edwin Black August 21, 2007
Although officials of the United States Holocaust Memorial Museum have steadfastly insisted that the secret records at the International Tracing Service located at Bad Arolsen are technically not ready for the Internet, both Red Cross and senior Bad Arolsen officials deny this. Indeed, Red Cross and senior Bad Arolsen officials confirm that most of their 42 million records could be made Internet ready within three-to-four months. Moreover, the Red Cross reveals, the idea of Internet access directly from Bad Arolsen computers bypassing a complicated and costly 11-nation export and transfer was twice suggested earlier this year: once by French delegates to the Commission and again by Bad Arolsen technology officers. Both offers were refused.
The Bad Arolsen computerized search mechanisms have been misportrayed by some news reports. But in a series of conference calls with this reporter followed by a requested official written statement of technical specifications, Bad Arolsen chief technology officer Michael Hoffman and archivist Udo Yost explained for the first time exactly how their system works. The ITS system, ten years in development, uses three interactive sets of prisoner informational data including TIFF and JPEG images of Nazi-era prisoner cards. Hoffman confirmed that given the correct name, birth date and birth city, “with a little luck, we get a hit on the full data set. If the system cannot get the correct information about a named individual on the first try, it defaults to the next probable hit using the sequence numbers, going through the candidate names. For example, for a person named ‘Rosenbaum,’ the system first gives all the ‘Rosenbaums,’ and then automatically gives you the next Rosenbaum, and the next Rosenbaum, until you find the correct Rosenbaum.”
Asked exactly how long a typical search of a correctly identified individual will take, even if it requires hitting all three forms of data, Yost volunteered: “If you are trained, it is quick, sometimes a matter of moments, maybe ten seconds, maybe one minute.” Yost confirmed to a Town Hall Meeting with survivors held June 18, 2007 at Nova Southeastern University that the system “does not need to be reinvented.”
A senior Red Cross technology officer was asked if placing the files on the Internet was legal and feasible. He immediately replied, “Of course.”
Indeed, all Bad Arolsen data files are now being exported to XML, the ideal Internet-ready data language now recommended by the World Wide Web Consortium, this in preparation for transfer to the USHMM.
Moreover, a Swiss Red Cross official revealed that in March 2007, France actually suggested to the eleven-nation Commission that governs the records that all Bad Arolsen files be accessed via a secure “virtual private network” on terminals of the archival repository on the territory of a given member state.
Later, the idea was one of three technology options formally proposed by Bad Arolsen officials during a mid-May 2007 Amsterdam conference of the Commission.
Asked why the nations attending the Amsterdam conference chose not to place all of the files on the Internet, a Red Cross official with direct access to the proceedings replied, “Don’t ask me. Technically it will be feasible to access these databases from anywhere in the world. We would just export to XML format. We could then support a virtually unlimited number of remote terminals. Member countries would not receive copies, just access. This option was not taken.”
He continued, “Had they chosen the Internet option, the records would be accessible in a matter of months.” He added, “But our role is just to propose solutions…and not our role to judge them.” Asked whether the Holocaust Museum supported France’s February proposal for Internet access, and Bad Arolsen’s May proposal, a key Commission official speaking on condition of anonymity declared cautiously, “Look, this entire process has been steered by one Organization and one country: The Holocaust Museum and America. You must ask them.”
However, Holocaust Museum spokesmen Andrew Hollinger and Arthur Berger, as well as Paul Shapiro, who leads the Museum’s Bad Arolsen project, repeatedly declined to be interviewed by this reporter.
Edwin Black, author of IBM and the Holocaust, has written numerous investigative articles on the Bad Arolsen collections and recently won an Integrity Award from survivors for his Bad Arolsen coverage.