----------------------------------------------------------------------
--- Knud van Eeden --- 15 Juny 2008 - 02:20 pm -----------------------
Language: Computer: BASIC: BBCBASIC: Windows: Font: Unicode: Character: World
---
There are 256 times 256, or thus 65536 characters possible in UniCode.
In this range most values are used to represent a character, others not
(these show e.g. an empty square).
---
You will need to use a Unicode font to show the characters (otherwise
you will see e.g. only empty squares).
Typical fonts are
Arial Unicode MS
(which you might have to install from the Microsoft Windows (XP) installation CD, if not present on your system).
This font contains the most Unicode characters.
Another possible Unicode font is
Tahoma
---
To draw a character, you will need to choose the first byte and the
second byte, and its xy position on the screen.
---
The languages are mostly bundled in groups of 256 characters.
E.g. Greek is in group 3, Russian is in group 4 of the totally 256 groups.
---
Usually the value is given in hexadecimal, because you can split that
number in 2 independent parts, which simplifies your calculations (e.g.
in your head).
---
E.g. character &0628 you split in &06 and &28 hexadecimal.
The first number gives the page (e.g. page 6).
The second number gives the position on that page (e.g. &28 is thus 40
decimal), between 0 and 255.
---
Here a few example programs, starting with the simplest possibility,
that is 1 character.
---
The random example varies randomly the xy position of that 1 character
on the screen, and also the value of that character (between 0 and 255
for the first byte, and 0 and 255 for the second byte, thus totally
between 0 and 65535).
---
The last example gives you an overview of all the possible Unicode
characters (that is 256 pages of 256 characters, or thus 65536
characters).
---
Note that you clearly see that the writings of some seemingly difficult
looking languages like Arabic, Hindi, ... clearly are built up of the
much simpler single characters, similar to the way we build sentences
in the Latin alphabet, by concatenating single alphabet characters to
strings. This insight could possibly boost your comprehension and speed
up your learning of that language.
===
Program: Simplest <author> Richard T. Russell </author> [kn, ri, su, 15-06-2008 16:50:38]
--- cut here: begin --------------------------------------------------
*FONT Tahoma, 16
xScreenI% = 50
yScreenI% = 50
REM shows Farsi character &0628 using Unicode font
Unicode$ = CHR$&28 + CHR$ &06
SYS "TextOutW", @memhdc%, xScreenI%, yScreenI%, Unicode$, LEN( Unicode$ ) / 2
SYS "InvalidateRect", @hwnd%, 0, 0
--- cut here: begin --------------------------------------------------
===
Program: Simple, using procedure instead
--- cut here: begin --------------------------------------------------
*FONT Tahoma, 16
:
REM shows Farsi character &0628 using Unicode font
PROCUnicodeCharacter( &06, &28, 50, 50 )
:
END
:
:
:
DEF PROCUnicodeCharacter( byte1I%, byte2I%, xScreenI%, yScreenI% )
LOCAL uniCode$
uniCode$ = CHR$( byte2I% ) + CHR$( byte1I% )
SYS "TextOutW", @memhdc%, xScreenI%, yScreenI%, uniCode$, LEN( uniCode$ ) / 2
SYS "InvalidateRect", @hwnd%, 0, 0
ENDPROC
--- cut here: end ----------------------------------------------------
===
Program: Drawing random Unicode characters on random position on the screen
--- cut here: begin --------------------------------------------------
*FONT Tahoma, 16
REM *FONT Arial Unicode MS, 16
REPEAT
xScreenI% = RND( 1000 )
yScreenI% = RND( 500 )
byte1I% = RND( 256 ) - 1
byte2I% = RND( 256 ) - 1
Unicode$ = CHR$( byte1I% ) + CHR$( byte2I% )
SYS "TextOutW", @memhdc%, xScreenI%, yScreenI%, Unicode$, LEN( Unicode$ ) / 2
SYS "InvalidateRect", @hwnd%, 0, 0
UNTIL FALSE
--- cut here: end ----------------------------------------------------
===
Program: Showing the 256 Unicode pages of totally 256 characters each
(totalling 256 x 256 = 65536 possibilities)
--- cut here: begin --------------------------------------------------
REM you must have a UniCode font installed
REM Check this e.g. in 'Control panel'->'Fonts'
REM (if not, install e.g. from the Microsoft (XP) CD)
REM (choose 'Install new font' in the 'File' menu
REM in the 'control panel' 'fonts' screen))
:
REM *FONT Tahoma, 16
:
*FONT Arial Unicode MS, 16
:
PROCUnicodePageAll( 0, 470, 30, 500, 30, 30 )
END
:
:
:
DEF PROCUnicodePageAll( xScreenMinI%, xScreenMaxI%, yScreenMinI%, yScreenMaxI%, xScreenStepI%, yScreenStepI% )
LOCAL byte1I%
LOCAL byte2I%
LOCAL xScreenI%
LOCAL yScreenI%
:
FOR byte2I% = 0 TO 255
:
xScreenI% = xScreenMinI%
yScreenI% = yScreenMinI%
:
CLG
:
PRINT TAB( 0, 0 );
PRINT; "language font unicode block (0-255) = "; byte2I%;
PRINT; " : <press any key>"
:
FOR byte1I% = 0 TO 255
:
PROCUnicode( byte1I%, byte2I%, xScreenI%, yScreenI% )
:
xScreenI% += xScreenStepI%
:
IF xScreenI% > xScreenMaxI% THEN
xScreenI% = xScreenMinI%
yScreenI% = yScreenI% + yScreenStepI%
ENDIF
:
NEXT byte1I%
:
REPEAT UNTIL GET
:
NEXT byte2I%
:
ENDPROC
:
DEF PROCUnicode( byte1I%, byte2I%, xScreenI%, yScreenI% )
LOCAL uniCode$
uniCode$ = CHR$( byte1I% ) + CHR$( byte2I% )
SYS "TextOutW", @memhdc%, xScreenI%, yScreenI%, uniCode$, LEN( uniCode$ ) / 2
SYS "InvalidateRect", @hwnd%, 0, 0
ENDPROC
:
--- cut here: end ----------------------------------------------------
===
Program: Showing the 256 Unicode pages of totally 256 characters each
(totalling 256 x 256 = 65536 possibilities)
with a description of the languages used on this page
and the hexadecimal position of each character on each page.
--- cut here: begin --------------------------------------------------
REM --- MAIN --- REM
REM you must have a UniCode font installed
REM Check this e.g. in 'Control panel'->'Fonts'
REM (if not, install e.g. from the Microsoft (XP) CD)
REM (choose 'Install new font' in the 'File' menu
REM in the 'control panel' 'fonts' screen))
:
REM *FONT Tahoma, 16
:
*FONT Arial Unicode MS, 11, B
:
PROCDataGetUnicodeLanguagePage
:
PROCTextViewUnicodePageAllCode( 0, 470, 30, 500, 30, 30 )
END
:
:
:
REM --- LIBRARY --- REM
:
REM library: text: view: unicode: page: all: code <description>View all Unicode pages (including hexadecimal position code)</description> (filenamemacro=viewteac.bbc) [kn, ri, su, 04-01-2009 22:00:33]
DEF PROCTextViewUnicodePageAllCode( xScreenMinI%, xScreenMaxI%, yScreenMinI%, yScreenMaxI%, xScreenStepI%, yScreenStepI% )
REM e.g. REM you must have a UniCode font installed
REM e.g. REM Check this e.g. in 'Control panel'->'Fonts'
REM e.g. REM (if not, install e.g. from the Microsoft (XP) CD)
REM e.g. REM (choose 'Install new font' in the 'File' menu
REM e.g. REM in the 'control panel' 'fonts' screen))
REM e.g. :
REM e.g. REM *FONT Tahoma, 16
REM e.g. :
REM e.g. *FONT Arial Unicode MS, 11, B
REM e.g. :
REM e.g. PROCDataGetUnicodeLanguagePage
REM e.g. :
REM e.g. PROCTextViewUnicodePageAllCode( 0, 470, 30, 500, 30, 30 )
REM e.g. END
REM e.g. :
REM e.g. :
REM e.g. :
LOCAL byte1I%
LOCAL byte2I%
LOCAL xScreenI%
LOCAL yScreenI%
LOCAL s$
:
FOR byte2I% = 0 TO 255
:
xScreenI% = xScreenMinI%
yScreenI% = yScreenMinI%
:
CLG
:
READ s$
*FONT Arial Unicode MS, 8
PRINT TAB( 0, 0 ); "content of this code page = "; s$
PRINT TAB( 0, 1 );
PRINT; "language font unicode block (0-255) = "; byte2I%;
COLOUR 1
PRINT; " (=page &"; STR$ ~byte2I%; ")";
COLOUR 0
PRINT; " : <press any key>"
*FONT Arial Unicode MS, 11, B
:
FOR byte1I% = 0 TO 255
:
PROCUnicode( byte1I%, byte2I%, xScreenI%, yScreenI% )
:
VDU 5
GCOL 0,1
MOVE xScreenI% * 2, 1000 - yScreenI% * 2
*FONT Arial Unicode MS, 8
PRINT; "&"; STR$ ~byte1I%
*FONT Arial Unicode MS, 11, B
VDU 4
:
xScreenI% += xScreenStepI%
:
IF xScreenI% > xScreenMaxI% THEN
xScreenI% = xScreenMinI%
yScreenI% = yScreenI% + yScreenStepI%
ENDIF
:
NEXT byte1I%
:
REPEAT UNTIL GET
:
NEXT byte2I%
:
ENDPROC
:
DEF PROCUnicode( byte1I%, byte2I%, xScreenI%, yScreenI% )
LOCAL uniCode$
uniCode$ = CHR$( byte1I% ) + CHR$( byte2I% )
SYS "TextOutW", @memhdc%, xScreenI%, yScreenI%, uniCode$, LEN( uniCode$ ) / 2
SYS "InvalidateRect", @hwnd%, 0, 0
ENDPROC
:
REM library: data: get: unicode: language: page <description></description> (filenamemacro=getdalpa.bbc) [kn, ri, su, 04-01-2009 23:06:02]
DEF PROCDataGetUnicodeLanguagePage
DATA Basic Latin; Latin-1 supplement
DATA Latin Extended-A; Latin Extended-B
DATA Latin Extended-B; IPA Extensions; Spacing Modifier Letters
DATA Combining Diacritical Marks; Greek and Coptic
DATA Cyrillic
DATA Cyrillic Supplement; Armenian; Hebrew
DATA Arabic
DATA Syriac; Arabic Supplement; Thaana; N'Ko (Mandenkan)
DATA Unused page
DATA Devanagari; Bengali
DATA Gurmukhi; Gujarati
DATA Oriya; Tamil
DATA Telugu; Kannada
DATA Malayalam; Sinhala
DATA Thai; Lao
DATA Tibetan
DATA Burmese (Myanmar); Georgian
DATA Hangul Jamo
DATA Ethiopic
DATA Ethiopic; Ethiopic Supplement; Cherokee
DATA Unified Canadian Aboriginal Syllabics
DATA Unified Canadian Aboriginal Syllabics
DATA Unified Canadian Aboriginal Syllabics; Ogham; Runic
DATA Tagalog; Hanunóo; Buhid; Tagbanwa; Khmer
DATA Mongolian
DATA Limbu; Tai Le; New Tai Lue; Khmer Symbols
DATA Buginese
DATA Balinese; Sundanese
DATA Lepcha (Rong); Ol Chiki (Santali / Ol Cemet')
DATA Phonetic Extensions; Diacrital marks
DATA Latin Extended Additional
DATA Greek Extended
DATA Punctuation; Superscripts; Subscripts; Currency; Diacritics
DATA Letterlike Symbols; Number Forms; Arrows
DATA Mathematical Operators
DATA Miscellaneous Technical
DATA Control Pictures; Optical Character Recognition; Enclosed Alphanumerics
DATA Box Drawing; Block Elements; Geometric Shapes
DATA Miscellaneous Symbols
DATA Dingbats; Mathematical Symbols-A; Supplemental Arrows-A
DATA Braille Patterns
DATA Supplemental Arrows-B; Mathematical Symbols-B
DATA Supplemental Mathematical Operators
DATA Miscellaneous Symbols and Arrows
DATA Glagolitic; Latin Extended-C; Coptic
DATA Georgian Supplement; Tifinagh; Ethiopic Extended; Cyrillic Extended-A
DATA Supplemental Punctuation; CJK Radicals Supplement
DATA Kangxi Radicals; Ideographic Description Characters
DATA CJK Symbols and Punctuation; Hiragana; Katakana
DATA Bopomofo; Hangul Compatibility Jamo; Kanbun; Bopomofo Extended; CJK Strokes; Katakana Phonetic Extensions
DATA Enclosed CJK Letters and Months
DATA CJK Compatibility
DATA CJK Unified Ideographs Extension A
DATA CJK Unified Ideographs Extension A
DATA CJK Unified Ideographs Extension A
DATA CJK Unified Ideographs Extension A
DATA CJK Unified Ideographs Extension A
DATA CJK Unified Ideographs Extension A
DATA CJK Unified Ideographs Extension A
DATA CJK Unified Ideographs Extension A
DATA CJK Unified Ideographs Extension A
DATA CJK Unified Ideographs Extension A
DATA CJK Unified Ideographs Extension A
DATA CJK Unified Ideographs Extension A
DATA CJK Unified Ideographs Extension A
DATA CJK Unified Ideographs Extension A
DATA CJK Unified Ideographs Extension A
DATA CJK Unified Ideographs Extension A
DATA CJK Unified Ideographs Extension A
DATA CJK Unified Ideographs Extension A
DATA CJK Unified Ideographs Extension A
DATA CJK Unified Ideographs Extension A
DATA CJK Unified Ideographs Extension A
DATA CJK Unified Ideographs Extension A
DATA CJK Unified Ideographs Extension A
DATA CJK Unified Ideographs Extension A
DATA CJK Unified Ideographs Extension A
DATA CJK Unified Ideographs Extension A; Yijing Hexagram Symbols
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA CJK Unified Ideographs
DATA Yi Syllables
DATA Yi Syllables
DATA Yi Syllables
DATA Yi Syllables
DATA Yi Syllables; Yi Radicals
DATA Vai
DATA Vai; Cyrillic Extended-B
DATA Modifier Tone Letters; Latin Extended-D
DATA Syloti Nagri; Phags-pa; Saurashtra
DATA Kayah Li; Rejang
DATA Cham
DATA Unused page
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA Hangul Syllables
DATA High Surrogates
DATA High Surrogates
DATA High Surrogates
DATA High Surrogates; High Private Use Surrogates
DATA Low Surrogates
DATA Low Surrogates
DATA Low Surrogates
DATA Low Surrogates
DATA Private Use Area
DATA Private Use Area
DATA Private Use Area
DATA Private Use Area
DATA Private Use Area
DATA Private Use Area
DATA Private Use Area
DATA Private Use Area
DATA Private Use Area
DATA Private Use Area
DATA Private Use Area
DATA Private Use Area
DATA Private Use Area
DATA Private Use Area
DATA Private Use Area
DATA Private Use Area
DATA Private Use Area
DATA Private Use Area
DATA Private Use Area
DATA Private Use Area
DATA Private Use Area
DATA Private Use Area
DATA Private Use Area
DATA Private Use Area
DATA Private Use Area
DATA CJK Compatibility Ideographs
DATA CJK Compatibility Ideographs
DATA Alphabetic Presentation Forms; Arabic Presentation Forms-A
DATA Alphabetic Presentation Forms; Arabic Presentation Forms-A
DATA Alphabetic Presentation Forms; Arabic Presentation Forms-A
DATA Variation Selectors; Vertical Forms; Half Marks; CJK Compatibility Forms; Small Form Variants; Arabic Presentation Forms-B
DATA Halfwidth and Fullwidth Forms; Specials
ENDPROC
:
--- cut here: end ----------------------------------------------------
===
Book: see also:
===
Diagram: see also:
===
Help: see also:
===
Image: see also:
Single UniCode character
Single UniCode character with supplemental information
Random unicode characters
Ascii
East European, Welsh, ...
Greek
Russian (Cyrillic)
Russian (Cyrillic) with supplemental information
Hebrew
Arabic
Hindi (India)
Japanese
Chinese
Korean
===
Internet: see also:
Find your Unicode character for your given language: Overview
http://unicode.org/charts/">http://unicode.org/charts/">http://unicode.org/charts/
---
Find your Unicode character for your given language (Latin, Greek, Cyrillic, Armenian, Hebrew, Arabic, Devanagari (India, Nepal), Bengali (India), Gurmukhi (India), Oriya (India), Tamil (Sri Lanka), Telugu (India), Kannada (India), Malayalam, Sinhala (Sri Lanka), Thai, Lao, Tibetan, Myanmar, Georgian, Hangul (Korea), Hiragana (Japan), Katakana (Japan), Han (simplified Chinese - China), Tifinagh (Berber), Balinese (Bali), ...)
http://www.unicode.org/charts/normalization/
---
Download: Unicode in BBCBASIC (executable)
---
efg's Unicode Lab Report
http://www.efg2.com/Lab/OtherProjects/Unicode.htm
---
Where can I get the Unicode fonts?
http://tlt.its.psu.edu/suggestions/international/web/unicode.html
---
Microsoft Windows: API: TextOutW
http://64.233.183.104/search?q=cache:-hfl5kT4RJ8J:msdn.microsoft.com/en-us/library/ms534019(VS.8
---
Unicode: Delphi
http://www.experts-exchange.com/Programming/Languages/Pascal/Delphi/Q_23279239.html
---
Unicode: Can you give an overview of links?
http://www.faqts.com/knowledge_base/view.phtml/aid/38864/fid/1852
===
Screencast: see also:
===
Table: see also:
===
Video: see also:
---
----------------------------------------------------------------------