Real Software Forums

The forum for Real Studio and other Real Software products.

All times are UTC - 5 hours




 Post subject: Sorting of Portuguese Language
Posted: Fri May 31, 2013 9:36 am

Joined: Wed Jul 28, 2010 4:03 am
Posts: 13
"ábaco" (first letter is á, not a) is a Portuguese word which means "abacus".

If I use the array.Sort() command to sort a Portuguese word list, "ábaco" is placed after "zoo", which is wrong.

Please tell me how to sort it alphabetically. Thanks.

_________________
My Freeware
http://www.italian.org.cn/web/


 Post subject: Re: Sorting of Portuguese Language
Posted: Mon Jun 03, 2013 9:04 am

Joined: Thu Apr 10, 2008 6:03 am
Posts: 303
Location: Paris-La Défense, France
Hi,

In my apps I use the following function:

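Sub SortAccentuatedArray(ByRef Data() As String)

  // accents and correct are parallel lookup strings: the character at a given
  // position in accents is replaced by the one at the same position in correct.
  Dim accents As String = "àáâãäçèéêëìíîïñòóôõöùúûüýÿÀÁÂÃÄÇÈÉÊËÌÍÎÏÑÒÓÔÕÖÙÚÛÜÝ"
  Dim correct As String = "aaaaaceeeeiiiinooooouuuuyyAAAAACEEEEIIIINOOOOOUUUUY"
  Dim temp() As String // accent-free copies, used as sort keys

  Dim i, j As Integer
  Dim uData As Integer = UBound(Data)
  Dim str As String
  Dim uLen As Integer
  Dim Pos As Integer

  For i = 0 To uData
    str = Data(i)
    uLen = Len(str)

    // Mid is 1-based, so scan characters 1 through uLen
    For j = 1 To uLen

      // Only non-ASCII characters can be accented
      If Asc(str.Mid(j, 1)) > 127 Then
        Pos = accents.InStr(str.Mid(j, 1))
        If Pos > 0 Then
          // Replace the accented char.
          // Using Replace is faster than using ReplaceAll
          str = Replace(str, accents.Mid(Pos, 1), correct.Mid(Pos, 1))
        End If
      End If
    Next

    temp.Append str

  Next

  // Sort the accent-free keys and reorder Data in parallel
  temp.SortWith(Data)

End Sub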


You can test it by adding the function to a Window.
Then add the following code in the Window.open event:

dim a() As String

a = Array("Zoo", "ère", "arbre", "ábaco")

SortAccentuatedArray(a)

MsgBox(Join(a, EndOfLine))


[Edit]: Updated the code to improve performance by 30% on an array of 89,000 entries.
On that array, the regular Array.Sort function takes ~780 ms
and the SortAccentuatedArray function takes ~2,800 ms.

_________________
Check my Website for high quality custom controls and classes (no plugins) for Windows, Mac OS and Linux
REALBasic 2012 R2 on Win 7 & Mac OS X


 Post subject: Re: Sorting of Portuguese Language
Posted: Mon Jun 03, 2013 11:21 am

Joined: Sat Oct 01, 2005 9:55 am
Posts: 527
That looks like it might fold the case as well as the accents. Not a problem really for sorting, but it could cause issues if you reuse the stripping code for something else. You could avoid that issue by using the bytewise string functions (MidB, etc.), as long as you are working with known and matching encodings.
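
To illustrate the concern: as far as I know, InStr and Replace compare text case-insensitively in REALbasic, so an upper-case accented letter may match its lower-case entry in the lookup string and come back as a lower-case plain letter. A small sketch to check this on your own build (the two-character lookup strings are shortened only for illustration, and the result may depend on version and encoding):

```realbasic
// Shortened lookup strings, for illustration only
Dim accents As String = "àÀ"
Dim correct As String = "aA"

// Character-wise search; if InStr ignores case, "À" can match the "à" at position 1
Dim p As Integer = accents.InStr("À")

// When p is 1, the upper-case letter would be replaced by the lower-case "a"
MsgBox("Position: " + Str(p) + ", replacement: " + correct.Mid(p, 1))
```

If the position shown is 1 rather than 2, the case has been folded along with the accent.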


 Post subject: Re: Sorting of Portuguese Language
Posted: Mon Jun 03, 2013 11:46 am

Joined: Thu Apr 10, 2008 6:03 am
Posts: 303
Location: Paris-La Défense, France
Could you please elaborate, silverpie?

I'm sorry, I don't understand what you mean by "it might fold the case as well as the accents".

The function I wrote doesn't modify the strings in the passed array.
It only creates a new array without accented characters, sorts that, and reorders the passed array to match.



 Post subject: Re: Sorting of Portuguese Language
Posted: Mon Jun 03, 2013 11:55 am

Joined: Fri Jan 06, 2006 3:21 pm
Posts: 12388
Location: Portland, OR USA
The bytewise string functions may not play well with UTF-8 (multibyte) data.
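
One possible middle ground, as a rough sketch only: keep stepping through the string character by character with Mid, so multibyte characters stay whole, but compare each character with StrComp in binary mode (mode 0) so that case is not folded. StripAccents is a hypothetical helper name, not something from the code posted earlier in this thread. It assumes the lookup strings and the input share the same encoding (for example UTF-8), because the binary comparison looks at the raw bytes.

```realbasic
Function StripAccents(s As String) As String
  // Hypothetical helper: returns a copy of s with accented characters
  // replaced by their plain equivalents, preserving upper and lower case.
  Dim accents As String = "àáâãäçèéêëìíîïñòóôõöùúûüýÿÀÁÂÃÄÇÈÉÊËÌÍÎÏÑÒÓÔÕÖÙÚÛÜÝ"
  Dim correct As String = "aaaaaceeeeiiiinooooouuuuyyAAAAACEEEEIIIINOOOOOUUUUY"
  Dim parts() As String
  Dim j, k, found As Integer
  Dim ch As String

  For j = 1 To Len(s)
    ch = s.Mid(j, 1) // one whole character, even if it is multibyte
    found = 0

    If Asc(ch) > 127 Then
      // Look for an exact (binary, case-sensitive) match in the lookup string
      For k = 1 To Len(accents)
        If StrComp(accents.Mid(k, 1), ch, 0) = 0 Then
          found = k
          Exit
        End If
      Next
    End If

    If found > 0 Then
      parts.Append correct.Mid(found, 1)
    Else
      parts.Append ch
    End If
  Next

  Return Join(parts, "")
End Function
```

With a helper like this, the sort keys in SortAccentuatedArray could be built with temp.Append StripAccents(Data(i)). The inner loop over the lookup string is likely slower than InStr, so it would be worth timing on a large array before relying on it.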

