Swordsmith Index: changes coming soon

As some of you may have noticed, the Swordsmith Index has already changed in several ways in recent weeks. This announcement gives details of the ongoing and planned updates to the Index.

Macrons for long vowels

When the first version of the Index was released nearly 3 years ago, the ability to search and maintain data was given the highest priority. To facilitate that, all names and terms were converted to plain ASCII, given the overall state of internet technology at the time. Recent research by Google shows that Unicode is now widely adopted and supported across the Internet, which reassured me that it is time to revisit this decision. Since we migrated to a new platform 6 months ago, it is now technically feasible to take another step forward and review the way Japanese names and terms are presented in the Swordsmith Index and throughout the site. The proposed changes are also based on feedback from visitors and members of Nihonto Club.

Most importantly, names of smiths, schools, provinces and eras, as well as signatures, will now be presented using macrons (ō and ū) for long o and u. The submission guidelines for Japan-related articles in Wikipedia are worth adopting: Manual of Style (Japan-related articles).

This change won't affect searching, as both internal search and Google search process macrons correctly (e.g. a search for 'Bishu Osafune' will find both 'Bishu Osafune' and 'Bishū Osafune'). Macrons should make it easier to reconcile Rōmaji with Kanji, and also help non-Japanese readers with pronunciation. As this is a (mostly) manual process, it will take a while to migrate the whole site, with its tens of thousands of records, to the new format.
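For readers curious how macron-insensitive matching can work, here is a minimal Python sketch. It is not the code the site or Google actually runs. It assumes a common technique: decompose each character with Unicode normalization and drop the combining marks, so that 'ū' and 'u' compare equal. The function names fold_macrons and matches are made up for this illustration.

```python
import unicodedata


def fold_macrons(text):
    """Return text with combining marks (such as macrons) removed."""
    # NFD splits 'ū' into 'u' followed by U+0304 COMBINING MACRON.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop every combining mark and keep the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


def matches(query, name):
    """Case-insensitive, macron-insensitive substring match (hypothetical helper)."""
    return fold_macrons(query).casefold() in fold_macrons(name).casefold()
```

With this approach the stored name keeps its macrons for display, while both the query and the name are folded only at comparison time, so 'Bishu Osafune' and 'Bishū Osafune' find each other.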

Archaic pronunciations

Rōmaji for iye and suye (家, 末 and alternatives) will be replaced by the modern forms ie and sue. E.g. Masaiye and Iyetsugu will be displayed as Masaie and Ietsugu. For more details about archaic pronunciations, see Historical kana orthography and Romanization of Japanese.

I'm still looking at provisions to make the names searchable by both archaic and modern forms.
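One possible way to do this, sketched in Python purely as an illustration (nothing here is decided or implemented), is to index each name under both its original spelling and a modernized spelling. The names modernize, search_keys and find are hypothetical. The simple pattern below covers only the iye and suye cases mentioned above and may over-match names where those letters do not come from 家 or 末.

```python
import re

# Matches the archaic spellings "iye" and "suye", in any letter case.
_ARCHAIC = re.compile(r"(i|su)ye", re.IGNORECASE)


def modernize(name):
    """Rewrite iye -> ie and suye -> sue, keeping the case of the first part."""
    return _ARCHAIC.sub(lambda m: m.group(1) + "e", name)


def search_keys(name):
    """Return the lowercase keys under which a name should be findable."""
    return {name.casefold(), modernize(name).casefold()}


def find(query, names):
    """Return the names that share at least one search key with the query."""
    wanted = search_keys(query)
    return [n for n in names if wanted & search_keys(n)]
```

Because both the query and the stored name are expanded to the same set of keys, a search for 'Masaiye' finds a record displayed as 'Masaie' and vice versa.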

Prefectures

In the past it was only possible to choose a Province in the swordsmith record. The list of provinces has now been extended with a full list of modern prefectures to allow correct data entry for modern smiths.

Data quality indicators

There is a 'verified' flag available for each smith record. If true, the record meets the minimal requirements (name-province-era) and doesn't contradict what is recorded in Hawley's for the given smith ID and name. Verification is manual. As data quality slowly improves, more grades may be needed to give a simple indication of how reliable a record is. 'True' for the Verified flag would correspond to grade 1 (out of 5). While the actual grade system still has to be decided, grades 2 and 3 may correspond to records that have been checked against well-known smith directories and books, like Toko Taikan, Fujishiro's and Meikan, and that provide enough detail to identify the smith confidently. Grades 4 and 5 would be given to records of historically important and well-known smiths with prominent and well-known works. The higher the grade, the easier it should be to tell which smith is meant when reading a book or magazine or looking at a sword. As a result, the record of a very well-known smith who had a number of indistinguishable generations would never get the higher grades. I consider the ability to narrow down the search for smith data to be the most important purpose of the Index.
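To make the grade 1 rule concrete, here is a small Python sketch. It models only what is described above: a record with name, province and era, plus a hand-set flag saying the record does not contradict Hawley's. The class SmithRecord, the field matches_hawley and the function names are assumptions made for this example and not the Index's real data model. Grades 2 to 5 are left out because they are still undecided.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SmithRecord:
    """Hypothetical, simplified smith record."""
    name: Optional[str] = None
    province: Optional[str] = None
    era: Optional[str] = None
    # Set by hand after checking the record against Hawley's.
    matches_hawley: bool = False


def meets_minimum(record):
    """True when name, province and era are all present and non-blank."""
    fields = (record.name, record.province, record.era)
    return all(f is not None and f.strip() != "" for f in fields)


def grade(record):
    """Return 1 for a verified record, otherwise 0 (higher grades undecided)."""
    return 1 if meets_minimum(record) and record.matches_hawley else 0
```

The check against Hawley's stays a manual step in this sketch; the code only combines its result with the name-province-era test.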

Smith IDs

Currently smith records use IDs from Hawley's 'Japanese Swordsmiths'. While this approach is widely accepted in Western Nihonto circles, it imposes some restrictions on the expansion of the Index. An alternative system will be proposed, but the current facilities for addressing smith records by Hawley ID will be left intact. Records which use 'extended' Hawley IDs (IDs which didn't exist in the original book and were added later; they have the prefix tmp in front of the ID in the Index) and which don't have easily identifiable provenance will be deleted.

Browsing

As the signature database grows, the primary way of browsing the Index will slowly shift towards a signature-based approach.


On a side note, all the smith records which were missing both Province and Era have now been manually corrected and enriched with data (about 750 records in total). In the coming months I'll be targeting all other records which lack the most important information. It is possible that Index verification will be completed by the end of 2010 (that is, all the records will have at least data quality grade 1). All this became possible since I moved to a new iMac, which greatly increases the productivity of maintaining Japanese-related texts.