4

I am looking for a Welsh language word list on my Ubuntu system. Running apt-file search /usr/share/dict/ doesn't show a Welsh option. However, the aspell-cy package does exist. Its official description is "This package contains all the required files to add support for the Welsh language to the GNU Aspell spell checker." I have installed it, but I can't find the word list it should be using.

Where can I find a Welsh language word list?

terdon
Simd

1 Answer

5

I don't know the details of how aspell works, but it looks like it stores its word lists in a compressed binary format rather than as plain text files. I believe you are looking for a file called cy.cwl.

I downloaded the tar.gz file from your Launchpad link, decompressed it, and ran ./configure followed by make:

$ wget https://launchpad.net/ubuntu/+archive/primary/+sourcefiles/aspell-cy/0.50-3-6.2/aspell-cy_0.50-3.orig.tar.gz 2>/dev/null 
$ tar xvzf aspell-cy_0.50-3.orig.tar.gz 
aspell-cy-0.50-3/
aspell-cy-0.50-3/doc/
aspell-cy-0.50-3/doc/gpl_1.0.txt
aspell-cy-0.50-3/info
aspell-cy-0.50-3/cy.multi
aspell-cy-0.50-3/README
aspell-cy-0.50-3/configure
aspell-cy-0.50-3/Copyright
aspell-cy-0.50-3/cy.dat
aspell-cy-0.50-3/cy.cwl
aspell-cy-0.50-3/welsh.alias
aspell-cy-0.50-3/Makefile.pre
aspell-cy-0.50-3/COPYING
$ cd aspell-cy-0.50-3/
$ ./configure 
Finding Dictionary file location ... /usr/lib/aspell-0.60
Finding Data file location ... /usr/lib/aspell-0.60
$ make
word-list-compress d < cy.cwl | aspell  --lang=cy create master ./cy.rws

The command printed by make looked like it was decompressing (d) cy.cwl and then piping the result to aspell to build the compiled cy dictionary (cy.rws). And, indeed, running just the first part of the command printed out the Welsh word list:

$ word-list-compress d < cy.cwl | head -n30
'ch
'i
'm
'ma
'n
'na
'r
'th
'u
'w
Aberdaugleddyf
Abergwaun
Aberhonddu
Abermo
Abertawe
Aberteifi
Adda
Adfent
Affrig
Ahasferus
Aifft
Ailfedyddiwr
Ailfedyddwyr
Alban
Albanwr
Albanwyr
Almaen
Almaenaidd
Almaeneg
Almaenes

So, either do what I did above, or just locate cy.cwl on your system, and then:

word-list-compress d < cy.cwl > welsh.dict
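
If you go the "locate cy.cwl" route, something along these lines might help. This is only a sketch: filter_cwl and dump_cwl are helper names I made up for illustration, and it assumes the package's file list (as printed by dpkg -L) contains a file ending in .cwl, possibly gzip-compressed as .cwl.gz. I have not checked where aspell-cy actually puts it on your release.

```shell
#!/bin/sh
# Sketch only: filter_cwl and dump_cwl are hypothetical helper names.
# Assumption: the installed package ships a *.cwl or *.cwl.gz file.

# Keep only the paths on stdin that look like aspell compressed word lists.
filter_cwl() {
    grep -E '\.cwl(\.gz)?$'
}

# Write the plain-text word list for one file to stdout.
# If the file is gzip-compressed (assumption), gunzip it first.
dump_cwl() {
    case "$1" in
        *.gz) gzip -dc "$1" | word-list-compress d ;;
        *)    word-list-compress d < "$1" ;;
    esac
}

# Example usage (commented out so that sourcing this file does nothing):
#   dpkg -L aspell-cy | filter_cwl
#   dump_cwl /path/printed/above > welsh.dict
```

The first helper just narrows down the package's file list; the second runs the same word-list-compress d step as above, with an extra gzip -dc in front when needed.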
terdon
  • Thank you! I tried exactly the same steps with aspell-ru, but running word-list-compress d < ru.cwl gives me "ERROR: Corrupt Input". Do you have any idea why? – Simd Jan 29 '21 at 12:39
  • @Anush you're welcome. I added a comment to your new question, but I am not an expert on aspell and its formats. I only found out what I have in my answer yesterday when I tried to answer your question. – terdon Jan 29 '21 at 12:49
  • Thanks. I added the information you mentioned to my other question. – Simd Jan 29 '21 at 13:18