Spell Check Dictionary Improvements
Wednesday, February 11, 2009
If you're anything like us, you're spending more and more of your time working online. The spellchecker built into Chromium can be a big help in keeping your blog, email, documents, and forum postings spelled correctly and easy to read. Chromium integrates the popular open source library Hunspell with WebKit's built-in spellchecking infrastructure to check words and to provide suggestions in 27 different languages.
The Hunspell dictionary maintainers have done a great job creating high-quality dictionaries that anybody can use, but one of the problems with any dictionary is that there are inevitably omissions, especially as new words appear or proper nouns come into common use. We at Google are in a good position to use our knowledge of the internet to identify and fix some of these omissions. The Google translation team used their language models to generate a sorted list of the most popular words in each language. This was cross-checked with the Hunspell dictionaries to generate a list of the top 1000 words not present in each dictionary. These lists include many popular words, but also common misspellings. To remove the misspellings, each list was reviewed by a specialist in that language. Generally, we tried to keep proper nouns and even foreign words as long as they were in common usage.
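The cross-check described above can be sketched roughly as follows. This is only an illustration, not the code the team actually used: the function names `top_missing_words` and `remove_rejected` are hypothetical, and a plain Python set stands in for the Hunspell dictionary (a real Hunspell lookup would also accept forms produced by affix rules).

```python
from typing import Iterable, List, Set


def top_missing_words(ranked_words: Iterable[str],
                      dictionary: Set[str],
                      limit: int = 1000) -> List[str]:
    """Return the first `limit` words, in popularity order, that the dictionary lacks.

    Assumption: `ranked_words` is already sorted from most to least popular,
    and `dictionary` is a plain set of accepted words (a stand-in for Hunspell).
    """
    if limit <= 0:
        return []
    missing: List[str] = []
    seen: Set[str] = set()
    for word in ranked_words:
        # Skip words the dictionary already knows and repeated entries.
        if word in dictionary or word in seen:
            continue
        seen.add(word)
        missing.append(word)
        if len(missing) == limit:
            break
    return missing


def remove_rejected(candidates: Iterable[str], rejected: Set[str]) -> List[str]:
    """Drop the words a human reviewer flagged (for example, common misspellings)."""
    return [word for word in candidates if word not in rejected]


# Toy data, invented for illustration only.
ranked = ["the", "webcam", "teh", "Obama", "and", "anime"]
known = {"the", "and"}

candidates = top_missing_words(ranked, known, limit=3)
reviewed = remove_rejected(candidates, rejected={"teh"})
print(candidates)  # ['webcam', 'teh', 'Obama']
print(reviewed)    # ['webcam', 'Obama']
```

The review step is kept as a separate function because, as the post notes, it was done by people rather than automatically; here the reviewer's decisions are simply passed in as a set of rejected words.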
We hope that by using the existing GPL/LGPL/MPL tri-license for our additions, our work can be picked up by other users of Hunspell. We also hope to make more improvements in the future, both for additional languages like Turkish, and to refine the word lists we already have. If you're passionate about your language, you can help out by writing affix rules for the added words or reviewing more word lists.
The recent dev-channel release of Google Chrome (2.0.160.0) has the additional words we generated for 19 of the languages. Hopefully, you'll see fewer common words marked as misspelled. For example, the English dictionary now includes "antivirus," "anime," "screensaver," and "webcam," and commonly used names such as "BibTeX," "Mozilla," "Obama," and "Wikipedia." For our scientific users, we even have "gastroenterology," "oligonucleotide," and "Saccharomyces"! We'd like to give special thanks for the great help we got from the translation team, who generated the words, and the language search specialists, who reviewed the lists.

28 comments:
Adam said...
Hooray for BibTex!
February 11, 2009 5:11 PM
yukuku said...
I hope the Indonesian dictionary is edited professionally. It has a lot of false-alerts on conjugated words, although root words are mostly okay.
February 11, 2009 7:14 PM
Stefan said...
As a horrible speller, I lean heavily on browser spell checks. I love Chrome but I often miss FF's spell check. I keep hoping you will somehow integrate "did you mean" into the spell check, as "did you mean" is the best spell check in the world.
February 11, 2009 8:11 PM
Christin said...
Hooray! It had been bothering me for months that Schwarzenegger was in the Chrome dictionary, but not Obama. Good call!
February 12, 2009 3:19 AM
Olaf Lederer said...
I have to write email, posts and comments in three different languages. In FF it is possible to switch between the dictionaries (right mouse click...). Since this feature is not available in Chrome, I use that browser only for Gmail and Analytics :(
February 12, 2009 4:29 AM
VicMatson said...
I hope the spell check gets as good as the one in Google toolbar? Someday.
February 12, 2009 5:32 AM
sidchat said...
@Olaf Lederer
The recent developer version of Chrome (2.0.162.0) now has the ability to change spell check dictionary language by right clicking on the text field.
You can switch your current Chrome build to the developer channel by following instructions in
http://dev.chromium.org/getting-involved/dev-channel
February 12, 2009 12:02 PM
المعلم حمادة said...
I've lived in America without a cent to my name since 2000. I don't do conscious barter, either, because conscious barter is just money in bulkier form.
a website I just got finished:
http://ebdaa.yoo7.com
February 14, 2009 12:23 AM
barnabasnagy said...
Hey, thanks for the post! I was googling this because I wrote on the same topic (http://is.gd/jXH5) and then I found you. Keep up the good work!
February 18, 2009 6:26 AM
Dicollecte said...
Hi,
Some dictionaries seem to be very old ones.
For example, the French dictionary is from 2002. It contains a lot of mistakes, and it uses the old MySpell affix structure.
A lot of improvements have been made since 2002.
You should have a look here:
http://wiki.services.openoffice.org/wiki/Dictionaries
The French dictionaries:
http://dicollecte.free.fr/download.php?prj=fr
February 19, 2009 12:08 AM
Brett Wilson said...
Dicollecte:
Thanks for the report, I filed this as a bug:
http://code.google.com/p/chromium/issues/detail?id=7966
February 23, 2009 10:46 AM
Mark said...
can we use Google toolbar on Google chrome? Does spell check offer fix it pop ups to help spell the word correctly. When I am in a hurry, I don't have time to respell the word. Please help me.
February 24, 2009 9:05 PM
Pavel said...
The biggest flaw of Chrome's spellchecker is the inability to change language on the fly. For all of us who use more than one language on a daily basis, the inability to switch languages makes this feature almost worthless.
March 1, 2009 2:13 AM
sidchat said...
@Pavel
The developer version of Chrome has the ability to change spell check language on the fly.
You can switch your current Chrome build to the developer channel by following instructions in
http://dev.chromium.org/getting-involved/dev-channel
March 2, 2009 11:53 AM
Tom said...
Hunspell is available for the .NET Framework too. NHunspell is a .NET version of Hunspell built with managed C++, so you can use the Hunspell dictionaries in your own .NET applications if you like. NHunspell is a free (LGPL-licensed) spell checker.
March 3, 2009 3:37 PM
JustLocal said...
Hi,
I'm the creator and maintainer of the Australian English spellcheck dictionary files which can be found at www.dictionary.JustLocal.com.au.
It would be greatly appreciated if you could add Australian English as an option so I don't have to fudge the system anymore. Then I and others could just copy the required dictionary files into the appropriate folders.
Thanks in advance.
Kelvin Eldridge
March 13, 2009 9:44 AM
sidchat said...
@JustLocal
Thank you very much for the information. A bug has been filed on this:
http://code.google.com/p/chromium/issues/detail?id=8934
March 18, 2009 11:50 AM
Steve said...
Spell check in Hotmail does not work when using Google Chrome. The Spell checker in Chrome is turned on. Spell check works when using Gmail and Yahoo Mail but does not in Hotmail.
March 30, 2009 4:53 AM
Fish & Chips said...
Hi Everyone..
Congratulations for the chrome..
The Portuguese language (Brazil) was changed last month. Will those new changes be implemented?
Best regards,
Rafael Peixe
April 13, 2009 6:48 AM
John said...
Can we suggest words to add? How?
December 1, 2009 6:16 PM
Nick Demou said...
Can Google publish the list of words that Google apps (Gmail etc.) mark as spelling errors but users choose to ignore? If yes, it would be great!
(I see how you help a lot when you can so I thought I could drop a thought)
February 3, 2010 4:44 AM
Jason said...
Not sure if it's a KDE issue or a chromium issue, but when KDE highlights a word inside chromium (ie: "definilty" ) as misspelled, and I right click and choose the corrected word the program splits and leaves the old word and inserts the new one (ie "defini definitely tly").
thanks,
Jason
February 5, 2010 1:36 PM
Jason said...
Oops, sorry, I forgot to mention my specific configuration:
I'm using openSUSE 10.2 and chromium installed from the google yast repository.
I've seen this issue in both gMail and Zimbra.
February 5, 2010 1:41 PM
Bay area shirts said...
The Infoplease spelling checker combines spelling help with our dictionary and thesaurus helps a lot.
Hard Drive Recovery
March 17, 2010 10:45 AM
Vince said...
The spell checker S***s in Chrome. It will say the word is spelled wrong and not give me any options to fix it, so I have to go to Google really quick and paste it in, and the word comes up spelled correctly in there. What is up with that? It makes entirely no sense to me.
March 17, 2010 4:15 PM
James C. Smith said...
It's nice to hear the word list was improved but the real problem with Chrome's spell checker is the suggestions it has for correcting a misspelled word. 30 times per day I end up copy/pasting my misspellings into a Google search to get a useful suggestion for how to correct my mistake. Chrome accurately finds all my mistakes but is hardly ever helpful when it comes to correcting them.
March 18, 2010 10:57 AM
mikeqw said...
I'm the creator and maintainer of the Australian English spellcheck dictionary files which can be found at auto insurance quotes, adipex without prescription, cheap auto insurance It would be greatly appreciated if you could add Australian English as an option so I don't have to fudge the system anymore. Then I and others could just copy the required dictionary files into the appropriate folders.
June 24, 2010 12:42 PM
M. C. Battilana said...
You wrote "We hope that by using the the existing GPL/LGPL/MPL tri-license for our addition". What about releasing Google's own wordlists (the ones used to determine the top 1000 missing words) under the same license?
As it is now, Google's 1000 extra words are under the tri-license, but the base dictionaries are not always (some appear to be GPL3 only), making legal use under the other licenses (including GPL2) dubious.
September 16, 2010 9:12 AM