JAT Bulletin 184-185, July-August 2000, June 17 JAT Meeting Report
JAT was kind enough to invite me to speak on the jeKai
dictionary project at JAT's monthly meeting on Saturday, June
17.
The jeKai project to prepare a free, open, online
Japanese-English dictionary began soon after this year's IJET
in Kyoto. While still in its infancy, the project is moving
along steadily. You can see the project's current status and
find out how to participate at
http://www.jekai.org/
I didn't count the audience, but the room was nearly full; I
would guess that there were maybe thirty people there. Most
seemed to know about jeKai in advance, though I suspect that
current jeKai contributors and members of the jeKai mailing
list were in the minority. Perhaps half the attendees were
Japanese and half were foreign. As is usual at JAT meetings,
the Japanese spoke mostly in Japanese and the foreigners in
English.
After brief self-introductions by everyone present, I spoke
about the origin and goals of jeKai for about thirty minutes.
I used a printout I had prepared in advance listing the
features of jeKai as they appear in the description on the
Web site:
* Definitions that explain the meaning of words as
completely as possible
* As many examples as possible of each word in real
contexts
* Photographs and other illustrations, especially for
entries about uniquely Japanese things
* No restrictions on the type or range of vocabulary
* No restrictions on the length of entries
I had also printed out three jeKai entries, for 赤提灯,
彼岸花, and 地球儀, so that people could see the sorts of
entries that have been produced so far. These entries, and
all of the others, can be seen at the Web site.
The rest of the time was devoted to questions and
discussion. I can't summarize everything (the entire meeting
lasted for about two hours), so I'll just mention a few
topics that hadn't been discussed in depth on the jeKai
mailing list yet: corpora, editorial control, and ensuring a
steady flow of new entries.
Traditionally, the best dictionaries have been produced by
assembling vast numbers of citations from written works and
using those citations to make judgments about the meanings,
usages, and histories of words. The most famous example of a
dictionary produced in this way is the Oxford English
Dictionary, and a recent best-seller, "The Professor and the
Madman," told one part of that story in an entertaining way.
Until recently, the citation gathering had to be done by
hand, but now computers have made it possible to gather and
search vast collections of texts--called "corpora" (singular:
"corpus")--much faster and more efficiently. As one example,
I mentioned the COBUILD corpus and dictionary project in the
U.K., which has yielded an excellent dictionary for learners
of English. Other recent English dictionaries, including the
New Oxford Dictionary of English and the Encarta World
English Dictionary, are also based on electronic
corpora.
Though I have heard rumors of corpus projects for Japanese,
I don't know of any Japanese dictionary that is corpus-based.
Lexicographer and jeKai contributor Hitoshi Sobo (惣坊均) has
given some thought to the idea of producing a free corpus of
Japanese, and he described his ideas at the meeting. Some
others also mentioned an interesting bilingual corpus of
Japanese and English that is available at
http://www.idd.tamabi.ac.jp/corpus/yourei/index.htm
Ideally, a corpus would be drawn from a collection of texts
(including transcripts of speech) that are somehow balanced
and representative, and COBUILD and other corpora strive
toward that goal. Since jeKai contributors do not have access
to such a corpus of Japanese, we must make do mostly with
citations from the Web. The Web is great for size and cost,
of course, but it is not really representative of the full
range of Japanese. Searches for Japanese words tend to yield
many hits from government reports, corporate documents,
bulletin board logs, and, especially, personal diaries. The
quality of the language is mixed, and it can be hard,
especially for us nonnative speakers of Japanese, to spot
errors and to distinguish between standard usages and
personal idiosyncrasies. The Web is also very weak for
literature, expository prose, transcripts of spoken language,
and anything more than five or six years old. If there were a
free corpus such as Sobo-san has proposed, then it would be a
great boon not only to jeKai and other lexicographic projects
but also to Japanese language educators, linguists, and other
scholars.
A couple of people raised the question of how and whether
editorial control is to be exerted at jeKai. Among other
things, Brian Chandler spoke about the photo.net model of
incremental additions to articles, and he mentioned the
importance of preserving previous versions of articles. I
hope that we can implement such a system in the near
future.
Another issue is where to draw the line with questionable or
offensive content. At present we have no rules about what
cannot be in jeKai. While I can imagine situations in which
we might want to exclude certain types of content, I feel
that, for the time being, we should continue to be open about
accepting contributions and refuse submissions only if they
are clearly incorrect or present imminent legal problems. If
problems do arise in the future (such as with offensive or
defamatory content), we can discuss what to do about them
here.
The most important issue raised at the meeting, I felt, was
how to encourage people to continue contributing to jeKai.
One person said he wished there were an easier interface, as
he knows no HTML. I said that, for the time being, plain text
will be fine (I will prepare the HTML). Also, Paul Flint is
working on a Web form that will make it possible to input
entries more easily. As soon as that's ready, we will put a
link on the Web site to the form. I'm also hoping that we can
prepare a revision form; if you want to correct, revise, or
add to an existing entry, you'll just click on a button and
the entry's content will appear in a Web form, where you can
edit it in your browser.
But even more important, I think, is the challenge of
motivating ourselves to finish and send in entries. To
maintain the momentum of the project, I have been trying to
make sure that we have at least one new jeKai entry a day
(I've missed a couple of days, including the day of the JAT
meeting), and that is one reason why perhaps half of the
entries have been created by me. One person mentioned that he
wanted to prepare an entry for the word 武道, an interest of
his, but that it turned out to be an immensely complex topic.
I suggested choosing something smaller, like some specific
武道-related term, and starting with that. Since even the
simplest-seeming word can, once you look into it, contain
many unexpected facets, I also suggested that we should set
deadlines for ourselves. Give yourself at most two hours, for
example, to prepare an entry, and when you reach the deadline
send it in as it is even if there is still much more that you
could say. If we insist on completeness and perfection, we
will achieve nothing.
While jeKai is intended not only for translators but also
for students, scholars, and others interested in learning
more about the meanings of Japanese words, translators are
uniquely qualified to contribute to the dictionary. I hope
that many JAT members and other translators will come to
regard jeKai as a useful and permanent forum for sharing
their knowledge about Japanese with the rest of the world.