Real Software Forums

The forum for Real Studio and other Real Software products.

All times are UTC - 5 hours




 Post subject: Newbie question: creating a word list in a database?
Posted: Tue Feb 26, 2013 5:17 am

Joined: Sun Feb 19, 2006 4:00 pm
Posts: 1282
Location: Heidelberg, Germany
Hi all,

my last SQL experience was about 25 years ago, and my current attempt at getting my head around databases is hitting the usual number of brick walls (well, maybe more than usual).

I'm using the built-in database, and what I need to do is basically make a table of sentences, create a word list including a count, and correlate the two (in reality it is protein sequences and peptides from a digest, but that's just semantics).

As I understand it:

• sentences have many words
• each word can be in many sentences
-> so there should be a many-to-many relationship, for which I need a link table

Table SENTENCES
- Sentence_ID
- SentenceText
- NumberOfOccurences

Table WORDS
- Word_ID
- WordText
- NumberOfOccurences

Table WORDS_IN_SENTENCES
- Sentence_ID
- Word_ID
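
A minimal sketch of these three tables and of how the link table correlates them. It assumes the built-in database behaves like SQLite and uses Python's sqlite3 module purely for illustration. The column types, the sample rows, and the helper name sentences_containing are assumptions, not part of the original design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SENTENCES (
        Sentence_ID        INTEGER PRIMARY KEY,
        SentenceText       TEXT NOT NULL UNIQUE,
        NumberOfOccurences INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE WORDS (
        Word_ID            INTEGER PRIMARY KEY,
        WordText           TEXT NOT NULL UNIQUE,
        NumberOfOccurences INTEGER NOT NULL DEFAULT 0
    );
    -- Link table: one row per (sentence, word) pairing.
    CREATE TABLE WORDS_IN_SENTENCES (
        Sentence_ID INTEGER NOT NULL REFERENCES SENTENCES(Sentence_ID),
        Word_ID     INTEGER NOT NULL REFERENCES WORDS(Word_ID)
    );
    -- Sample rows, made up for this sketch.
    INSERT INTO SENTENCES (Sentence_ID, SentenceText)
        VALUES (1, 'the cat sat'), (2, 'the dog ran');
    INSERT INTO WORDS (Word_ID, WordText) VALUES (1, 'the'), (2, 'cat');
    INSERT INTO WORDS_IN_SENTENCES (Sentence_ID, Word_ID)
        VALUES (1, 1), (2, 1), (1, 2);
""")

def sentences_containing(conn, word):
    """Return the text of every sentence linked to the given word."""
    rows = conn.execute(
        """
        SELECT DISTINCT s.SentenceText
        FROM SENTENCES s
        JOIN WORDS_IN_SENTENCES ws ON ws.Sentence_ID = s.Sentence_ID
        JOIN WORDS w ON w.Word_ID = ws.Word_ID
        WHERE w.WordText = ?
        ORDER BY s.SentenceText
        """,
        (word,),
    )
    return [r[0] for r in rows]
```

Swapping the roles of the two outer tables in the join would give the reverse lookup (all words in a given sentence), which is the point of routing the relationship through WORDS_IN_SENTENCES.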

My problem now: words and sentences should only occur once in their respective tables. There is a property NumberOfOccurences which keeps track of how often a word occurs.

But how do I do this in practice?

To ensure that a word only occurs once I set the WordText to be unique.

But how do I now insert a word into the table?

I add a sentence to the table SENTENCES, split it, ... and now I'm unsure of how to proceed. How do I insert the word into the WORDS table and WORDS_IN_SENTENCES table?

I have a feeling that I need to check whether a word is already in WORDS: if so, just add it to WORDS_IN_SENTENCES; if not, add it to both. Then at the very end (after processing all sentences) I would count, for each word, how often it appears in WORDS_IN_SENTENCES. Would that be the correct way of doing it? And how do I get the ID?
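
One possible way to sketch this procedure, again assuming an SQLite-like database and using Python's sqlite3 module for illustration. The function names (word_id_for, add_sentence, update_word_counts), the whitespace split, and the compact schema are assumptions. The UNIQUE constraint lets INSERT OR IGNORE skip known words, a SELECT then returns the ID, and the counts are filled in once at the end.

```python
import sqlite3

SCHEMA = """
CREATE TABLE SENTENCES (Sentence_ID INTEGER PRIMARY KEY,
                        SentenceText TEXT UNIQUE,
                        NumberOfOccurences INTEGER DEFAULT 0);
CREATE TABLE WORDS (Word_ID INTEGER PRIMARY KEY,
                    WordText TEXT UNIQUE,
                    NumberOfOccurences INTEGER DEFAULT 0);
CREATE TABLE WORDS_IN_SENTENCES (Sentence_ID INTEGER, Word_ID INTEGER);
"""

def word_id_for(conn, word):
    """Return the Word_ID of word, inserting it into WORDS first if needed."""
    # Thanks to UNIQUE(WordText) this insert is silently skipped for known words.
    conn.execute("INSERT OR IGNORE INTO WORDS (WordText) VALUES (?)", (word,))
    row = conn.execute(
        "SELECT Word_ID FROM WORDS WHERE WordText = ?", (word,)
    ).fetchone()
    return row[0]

def add_sentence(conn, text):
    """Store a sentence and link its words; return its Sentence_ID."""
    row = conn.execute(
        "SELECT Sentence_ID FROM SENTENCES WHERE SentenceText = ?", (text,)
    ).fetchone()
    if row is not None:
        # Known sentence: bump its counter and do not link its words again.
        conn.execute(
            "UPDATE SENTENCES SET NumberOfOccurences = NumberOfOccurences + 1 "
            "WHERE Sentence_ID = ?", (row[0],))
        return row[0]
    cur = conn.execute(
        "INSERT INTO SENTENCES (SentenceText, NumberOfOccurences) VALUES (?, 1)",
        (text,))
    sentence_id = cur.lastrowid  # ID the database assigned to the new row
    for word in text.split():    # assumption: words are whitespace-separated
        conn.execute(
            "INSERT INTO WORDS_IN_SENTENCES (Sentence_ID, Word_ID) VALUES (?, ?)",
            (sentence_id, word_id_for(conn, word)))
    return sentence_id

def update_word_counts(conn):
    """Final pass: count each word's rows in the link table."""
    conn.execute("""
        UPDATE WORDS SET NumberOfOccurences = (
            SELECT COUNT(*) FROM WORDS_IN_SENTENCES
            WHERE WORDS_IN_SENTENCES.Word_ID = WORDS.Word_ID)
    """)

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
for s in ["the cat sat", "the dog sat", "the cat sat"]:
    add_sentence(conn, s)
update_word_counts(conn)
conn.commit()
counts = dict(conn.execute("SELECT WordText, NumberOfOccurences FROM WORDS"))
```

In this sketch a word's count is the number of link rows it has, so a word that appears twice within one sentence is counted twice, while a repeated sentence only increments the sentence's own counter. Whether that matches the intended meaning of NumberOfOccurences is a design decision the post leaves open.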

Considering my "word list" could be several hundred thousand words long, speed might be an issue ...
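
On the speed concern, a hedged sketch of two common measures, assuming an SQLite-like database and Python's sqlite3 module for illustration: do all inserts inside a single transaction instead of committing per row, and index the link table column used for counting. The function name bulk_insert_words and the index name are hypothetical.

```python
import sqlite3

def bulk_insert_words(conn, words):
    """Insert many words in one transaction; duplicates are skipped."""
    with conn:  # commits once at the end, rolls back if an error occurs
        conn.executemany(
            "INSERT OR IGNORE INTO WORDS (WordText) VALUES (?)",
            ((w,) for w in words))

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE WORDS (Word_ID INTEGER PRIMARY KEY,
                        WordText TEXT UNIQUE,
                        NumberOfOccurences INTEGER DEFAULT 0);
    CREATE TABLE WORDS_IN_SENTENCES (Sentence_ID INTEGER, Word_ID INTEGER);
    -- Speeds up per-word counting over the link table.
    CREATE INDEX idx_wis_word ON WORDS_IN_SENTENCES (Word_ID);
""")

bulk_insert_words(conn, ["alpha", "beta", "alpha", "gamma"])
total = conn.execute("SELECT COUNT(*) FROM WORDS").fetchone()[0]
```

In SQLite the UNIQUE constraint on WordText already creates an index, so looking up a word's ID by its text does not need a separate one.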

Thanks for any advice.

Markus

