UTF-8 encoded sample plain-text file
‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾

Source
  https://antofthy.gitlab.io/info/data/utf8-demo.txt

Original version from Markus Kuhn [ˈmaʳkʊs kuːn] from University of Cambridge
  http://www.cl.cam.ac.uk/~mgk25/
  https://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-demo.txt
  https://antofthy.gitlab.io/info/data/unicode_examples/UTF-8-demo.txt
  https://gist.github.com/msabramo/3921955

A UTF-8 program stress test file...
  https://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt
  https://antofthy.gitlab.io/info/data/unicode_examples/UTF-8-test.txt

This version has been updated with more sets, and some other symbols as a
reference.

This page started from the aforementioned example page but has expanded far
beyond the original. It is particularly focused on the interactions between
Unicode characters, so they work as a seamless whole. It is meant to provide
a reference of Unicode characters that work well together, and to report
what doesn't work even though it should.

This document will very quickly show up any problems a particular font or
application has.

Before using Unicode in applications I recommend you also look at
"unicode.txt" in this directory...
  https://antofthy.gitlab.io/info/data/unicode.txt

Many examples below are designed to work with a fixed-width or
non-proportional font.

I have also made notes about a number of fonts that I found 'promising'.
See "unicode_fonts.txt" in this directory...
  https://antofthy.gitlab.io/info/data/unicode_font.txt

So far no font has been found to be absolutely 'perfect', though a few are
pretty close.

General Unicode Notes and Problems...

If the first non-space character in a line is special (braille, some math
characters, diacritic), then all the spaces before that character may be
inexplicably replaced with matching half-width spaces, even in a fixed-width
font. Aargh...

"Combining Diacritical Marks" generally work in all GTK applications, but often do NOT handle of "Combining Symbols" (block U+20D0) properly. EG: These very useful combined characters X s% ' ' do not always work... Anthony Thyssen, July 2020 =============================================================================== Typographical Usage of Unicode: T%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%W% Q% Q% Q% "  single and  double quotes [' \' ]' ^' Q% Q% Q% Q% " Curly apostrophes:  We ve been here Q% Q% Q% Q% " Latin-1 apostrophe and accents: '` Q% Q% Q% Q% "  deutsche  Anfhrungszeichen Q% Q% Q% Q% " , ! , %, 0 , " , 3 4,  , "5/+5, "!, & Q% Q% Q% Q% " Underline using the line below Q% Q% > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Q% Q% " ASCII safety test: 1lI|, 0ODQ, 8B Q% Q% m%%%%%%%%%%%%%%%%%n% Q% Q% " Currency in Box: % 14.95 5 2 % Q% Q% p%%%%%%%%%%%%%%%%%o% Q% Z%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%P%]% Mathematics and Science Usage: ." e"da = q, n ! ", " f(i) = " g(i), "x"!: #x # = " #"x #,  '"  = ( (" ), ' $! " ! " ! " ! " ! " ! ', a `" b a" c d" d e" e ('a' ! 'b') 2h + o ! 2h o, r = 4.7 k, "d H" "2, "5, #200 mm, "30, "" # # # # # # # %%%%%%% " # # # # # # # %a+b # # # # # # # # %%%%%% #aq -bq # # # # # # # # c i=1 # # # # # # !# In the above examples, extended bracket alignment requires a fixed with font. X Window 'fixed' fonts work correctly for all the above. True-type "Monospace","Terminus" fonts also works well for the extended brackets. The X window Misc fonts treat square root base, "Radical Symbol Bottom", '#', so that it joins with vertical box character so you can create larger multi-line square root symbols. However most true-type fonts implement the glyph more like a normal square-root symbol ('"'), and will sometime join while other times it does not. 
Also most truetype fonts do not ensure the double height summation (like '"') correctly join up, leaving glaring gap vertically in the final larger symbol. ------------------------------------------------------------------------------- Character Symbol Groupings... Typographical: dash: -  single apostrophe ' ndash:   double ascents mdash:  ellipsis & decent ` [' fancy \' ]' quotes ^' Spaces: I use this as a reference of 'invisible characters'. No Break Space '' \u00A0 (also called meta-space) Narrow No break '/ ' \u202F Space Unicode: \u2000 to \u200F '           ' '_ ' \u205F Math Space '(' \u2800 Braile Space (not blank in GTK "Monospace") '0' \u3000 Chinese Ideograph Space (double width) '` ` ' \u2060 Word Joiner (vim has trouble, and stuffs rest of line) '' \uFEFF Zero Width - no break space Various Symbols Grouped by simularity: Ticks & Crosses: ' ' " {# ' ' ' ' L' N' s% & i! x (last is a normal 'x') & & & ?0 & X s% ' ' (last group are combined symbols) Dot Centered: " $ ' `  e " " % Circle Centered: % % % % % "  " P  o 0 O Circled Dot: " " % % O ># &  * (goes wierd in GTK) Misc Dots: . $ @( ( ( ( (period and braile dots) Misc Sm Circles: a "  ~ Diamonds: "  # % ' % % % Sm Squares: % % # % % % % " % % " Boxed Symbols: 8# 9# ;# :# <# A# B# G# H# L# S# C# D# M# T# P# W# ^# Combined Dots: . " " " " " % (these are combined symbols) Triangles: % %  % % % % % % % % % % % % % % % % % % Stars: " & & )' +' ,' -' .' /' 0'  + "  ' ' ' )& # "' #' $' %' ' '' &' # * N  6' 2' 1' ;' <' =' >' Arrows: ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! + ! ! ! ! ! ! # ! & ! ' ' ' ' ' ! ! ' ' ! ! ' ' ' ' ' Very long in proportional truetype Arrows Multichar: %%' P%P%%%' %  % "  " z"  {" %  % %  % Punctuation: ? &$ ! < b' Quote Pairs: '' ` ""         [' \' ]' ^' Ellipses: $ % & " % % % % Hyphens: -      (  no break hyphen ) Brackets: ( (. ). ) } ~ [ ' ' ] z" 9 < ' ' > : {"     ' j" " " k" ' Vertical Bars: | X' Y' Z' % N% % % w%X'u% Maths: " " % 0 1 R 5" 4" *. +. Footnotes: ! 
K Q #    %& Other symbols: d' a& B& @& & >& =& & ' "& " # %$ E$ `& c& e& f& i& j& k& l& m& n& & & & & ! ! "! $ $ !$ $ 8  ) #$   z 5" 4" 6" 7" 6" " % % V' -. *. +. V ,. Y b a : g G( 6( ( # # (double width in xterm/vim) Smiley, faces: :& 9& ;& h# 2 2 0 0 0 0 0 (double width, not in X misc) Morphology " " " " " " " " " " " " " Part circle % % % % % % % % % Super-scripts (are not in sequence)... p t u v w x y z { | } ~ q       Sub-scripts Fractions S! T! U! V! W! X! Y! Z! [! \! ]! ^! % ! 0 1 Roman numerals `! a! b! c! d! e! f! g! h! i! j! k! l! m! n! o! p! q! r! s! t! u! v! w! x! y! z! {! |! }! ~! ! Number Period $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ Bracket Numbers t$ u$ v$ w$ x$ y$ z$ {$ |$ }$ ~$ $ $ $ $ $ $ $ $ $ Circle Numbers ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' $ `$ a$ b$ c$ d$ e$ f$ g$ h$ i$ j$ k$ l$ m$ n$ o$ p$ q$ r$ s$ Braille @( D( F( G( ( ( ( ( ( 6( ( @( ( ( ( ( ( ( ( ( $( ( ( H( ( "( ( These are all pretty good. Though GTK fonts do replace some characters with emoji equivalents. Also see Spinners for many unicode glyph animation sequences... https://antofthy.gitlab.io/info/ascii/Spinners.txt ------------------------------------------------------------------------------- Drawing Characters... Box Drawing, Block Elements & Geometric Shapes U+2500: Box Character Examples... 
%,%% S%e%V% %,%% %0%% %w%% C%@%D% F%H%E% %N%% % %O%% % % %%%%%% %<%$%%_%k%b% %<%$%% %B%(% v%<%t% =%<%>% J%K%I% %N%% % %O%% % % q%r%q%r%s%s%s% %% %% %4%% Y%h%\% %4%% %8%% %u%% E%A%F% D%G%C% %N%% % %O%% % % r%q%r%q%s%s%s% %%%%% %% % Q% % % % q%r%q%r%s%s%s% % %% %%%%% R%d%U% T%f%W% %/%% %3%% %{%% m%}%n% %2%1%'% %%%% %%%% % r%q%r%q%s%s%s% %% %% %% %%% %% % ^%j%a%P%`%l%c% %?%%%%#%K%+% z%K%x% |%K%~% %:%9%&% L%L%L%L% M%M%M%M% % % %%%% %% %%%% % X%g%[% Z%i%]% %7%% %;%% %y%% p%%o% "%-%.%*% %%%% %%%% % %%%%%%%% %% %%% % %% % T%P%f%P%W% %%,%%% m%%,%%n% m%%,%%n% !%5%6%)% % % % % % % % % % % %%% %% Q% %h%%Q% %T%g%W%% %R%j%U%% %S%A%V%% %%%%%%%% % % % % `%a%s%^%c% %b%s%_%$% %<%<%<%$% %k%B%k%$% %% %%% %%% m%n% %%%%%%%% % % %% % Q%%e%%Q% %Z%d%]%% %X%j%[%% %Y%@%\%% %% % % %%% p%o% % % %% % Z%P%i%P%]% %%4%%% p%%4%%o% p%%4%%o% %%% %%% m%n% m%n% %%%% % % %%,%% %%,%% %%2%% S%%e%V% R%P%d%U% T%P%f%W% p%<%%%<%o% %% %% %% % % %% % %% % %% Q% Q%Q% % %% Q% Q%Q% r%%q% % % % %%% %% %%<%$% %%F%+% "%%K%+% _%%k%b% ^%P%j%a% `%P%l%c% %s%% m%<%%%<%n% %% % %%4%% %%;%% %%;%% Y%%h%\% X%P%g%[% Z%P%i%]% q%%r% p%o% p%o% %% %% %%,%%%%%%%%%%,%%%%%%%%%%% %%%%%% % %%% %%%%% %%%%,%%%%% % %%%%%%%%%%%%%%%%%%%%% %.%-%,%,%2%1%,%,%,%% %%% % %%% % % w% u% w% u% % % T%P%P%P%W% Some Text %% %<%<%<%<%D%C%<%<%<%'% % % % w% % u% % %%%%4%%% % % Z%P%f%P%]% in the box %% %<%<%<%<%<%<%<%<%<%&% % % %%$% % v%%4%%%%%%t% % % ^%P%d%P%P%i%P%P%d%P%P%P%P%P%P%P%P%P%P%P%a%% "%E%<%<%<%<%<%<%<%F%*% \_wrong in % %%t% % % %%%%%%,%%%%$% % % %%%,%%%$% %% !%C%<%<%<%<%<%<%<%D%)% / X 9x15 font % v%%,%%$% %%% v%%% % %%% % % %%%4%%%% %% %<%F%H%E%<%<%A%<%<%$% %%% % u% %%%%%%$% % u% v%%$% %%%%%%%%%%%%%%%%%%%%%% %<%J%K%I%<%>%K%=%<%$% % % %%% % %%t% % %%,%%% % %%%%%%%%%%%%%%%%%%%%% %<%D%G%C%<%<%@%<%<%$% % %%% %%% % v%%4%%% % u% % %<%<%<%<%F%E%<%<%<%$% % v%%4%%%%%%4%%%%t% u% % v%%$% %6%5%4%4%:%9%4%4%4%% %%%%%%%%%%%%%%%%%%4%%%%% Matching Sets Geometric Shapes in U+2500: Squares: % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % 
% % % % % Triangles: % % % % % % % % % % % % % % % % % % % % % % % % % # # # # # # # Diamonds: % % % Circles: % % % % % % % % % % % % % % % % % % % % %% % Rectangles: % % % % Note these do not connect to each other, but can be good for framing... Includes simularly matching chars are from other sets... % % %% %% # # # # # # # #   "# % # #  @ # % % %% %% # # # # # # # #   ## % # #  ? # Wrong in X misc 9x15 font %%%o% w% CJK or blank in Truetype fonts %%o% Other Examples... https://en.wikipedia.org/wiki/Box-drawing_character https://www.vidarholen.net/cgi-bin/labyrinth?w=13&h=13 http://xahlee.info/comp/unicode_drawing_shapes.html http://tamivox.org/dave/boxchar/index.html http://clubmate.fi/using-pseudographics-in-blogposts-drawing-ascii-diagrams-and-boxes/ Using Box drawing with other unicode sets... Warning: box lines do not always work with other shapes. But should! Obviously font designers do not care about box drawing fonts all that much! As such they really only work in fixed width terminals, like xterms 9& w% '" " %%%o%m%%%% q% r% q% "<%%%%%%%%#%#%%" "%%" "%% "%% %%P%P%%%%% u% (" " !P%P%P%! "P%P%P% "P%P%"P%P%R"Q"S"P"W"P%P%P%P%T"P%P%V"P%P%U"P%P%a"P%P%c"P%P% 'P%P%P%' Longer prop font double line arrows \u27f8-9 %%%%%' P%P%P%P%P%' Right arrow heads (X windows and GTK "Monospace") ------------------------------------------------------------------------------- Horizontal Lines: m%%%%%%% overbar punctuation (underline using line below) % m%%%%%%% horizontal line extension (" (" %% > > ########## %% '" '" p%%%%%%%%%%%%%%%%%p%%%%% 1/8 block, top and bottom %%%%% \u2500 box drawing horizontal line %   % \u2015 horizontal bar (can be 'long' in some fonts) %###% \u23af horizontal line extension (fails in "Monospace") %%%%%     ##### The three together (aligned with above) Arrows using horizontal bar \u2015 (seems to be the best choice overall, though 'dashy' in some fonts) %   % %   % "   " z"   {" !   ! 
%   % %   % "   " "   " "   " %   % %   % '   '    '   ' Notes: \u2500 box drawing line, should work but often doesn't. \u23af horizontal line extension works well, though "chrome" replaces it. \u2015 horizontal bar, works but is very long in proportional fonts. All work perfectly for X window "misc-fixed" fonts. Only "horizontal bar" works for GTK "Monospace" and Truetype "Terminus" fonts. ------------------------------------------------------------------------------- Vertical Lines: There is a lot of vertical bars for extended brackets in the U+2300 unicode block. You should ensure you use the right one for each bracket type. See example in the "Mathematics and Sciences" section above. _used with_ # \u23b8 left box line (bad in many fonts) % \u258f 1/8 block left # \u23a2 # # left square bracket # \u239c # # left parenthesis # \u23aa # # # # # # curly braces (extension bad in GTK) % \u2502 # % % % box drawing vertical line (see above) # \u23ae # !# intergral sign # \u239f # # right parenthesis # \u23a5 # # right square bracket % \u2595 1/8 block right # \u23b9 right box line (bad in many fonts) # \u23d0 vertical line extension (missing in X window fonts) % % % % " " '" " " ' % <-- \u2503 box vert line bold % % % % % % % % % % % <-- \u2502 box vert line % % % % " " (" " " ' Q% <-- \u2551 box vert line doubled Just about all the centered vertical lines work for vertical arrows. No problems for X windows "fixed" font (as always). GTK "Monospace" works okay, except for '"' '"' '#' It also loses prefix spaces for many characters in "gedit" But "Terminus" has proportional width faults unless you are using "gvim" which 'squares' all characters anyway. ------------------------------------------------------------------------------- Upside Down Letters: NB: Characters are from all over the unicode, some are not always available ZD!XM')""SOWB!"IHA!2!%!" zxnsybdouo~ e_pTqP 068%19#1     K! Alternatives i -> 1  l ->   2 -> Z  G -> A!  . 
-> (  , -> '   Example Phrases P lPysn" ypun uop ooy_ su yA! ypun uop ooy_ Pp,PA! s qooz no P e H un_ PH s1qooz no P e1u pooA! Converters... http://www.upsidedowntext.com/ https://fsymbols.com/generators/aboqe-flip/ https://www.fileformat.info/convert/text/upside-down.htm https://www.fileformat.info/convert/text/upside-down-map.htm http://xahlee.info/comp/unicode_invert_text.html Other types of text substitution converters.... https://fsymbols.com/generators/wavy https://fsymbols.com/generators/zalgo https://fsymbols.com/generators/carty/ https://fsymbols.com/generators/smallcaps/ ... =============================================================================== Unicode Block Tables... Spacing Modifier Letters U+02B0                                                                                 Tolkan Runes U+16A0                                                                                 --- Punctuation U+2000                 ! " # $ % & ' 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ Superscripts & Subscripts U+2070 p q t u v w x y z { | } ~  --- Arrows U+2190 ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! Dingbat Arrows (U+2790) ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' Supplement-A (U+27F0) ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' Supplement-B (U+2900, not in X fonts) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) ) !) ") #) $) %) &) ') () )) *) +) ,) -) .) /) 0) 1) 2) 3) 4) 5) 6) 7) 8) 9) :) ;) <) =) >) ?) @) A) B) C) D) E) F) G) H) I) J) K) L) M) N) O) P) Q) R) S) T) U) V) W) X) Y) Z) [) \) ]) ^) _) `) a) b) c) d) e) f) g) h) i) j) k) l) m) n) o) p) q) r) s) t) u) v) w) x) y) z) {) |) }) ~) ) Arrows from other sets...   ! ! ! ! 
# z" {" % % % % % % % % " " " " % % % % % % % % Diacritical Arrows... | Protect from end of line space removal See "Combining Characters" below for more info --- Mathematical U+2200: " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " !" "" #" $" %" &" '" (" )" *" +" ," -" ." /" 0" 1" 2" 3" 4" 5" 6" 7" 8" 9" :" ;" <" =" >" ?" @" A" B" C" D" E" F" G" H" I" J" K" L" M" N" O" P" Q" R" S" T" U" V" W" X" Y" Z" [" \" ]" ^" _" `" a" b" c" d" e" f" g" h" i" j" k" l" m" n" o" p" q" r" s" t" u" v" w" x" y" z" {" |" }" ~" " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " Math Supplemental U+2A00 * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * --- Technical U+2300: # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # !# "# ## $# %# &# '# (# )# *# +# ,# -# .# /# 0# 1# 2# 3# 4# 5# 6# 7# 8# 9# :# ;# <# =# ># ?# @# A# B# C# D# E# F# G# H# I# J# K# L# M# N# O# P# Q# R# S# T# U# V# W# X# Y# Z# [# \# ]# ^# _# `# a# b# c# d# e# f# g# h# i# j# k# l# m# n# o# p# q# r# s# t# u# v# w# x# y# z# {# |# }# ~# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # --- Miscellaneous U+2400: $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ !$ "$ #$ $$ %$ &$ @$ A$ B$ C$ D$ E$ F$ G$ H$ I$ J$ `$ a$ b$ c$ d$ e$ f$ g$ h$ i$ j$ k$ l$ m$ n$ o$ p$ q$ r$ s$ t$ u$ v$ w$ x$ y$ z$ {$ |$ }$ ~$ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ --- 
Graphics U+2500: % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % !% "% #% $% %% &% '% (% )% *% +% ,% -% .% /% 0% 1% 2% 3% 4% 5% 6% 7% 8% 9% :% ;% <% =% >% ?% @% A% B% C% D% E% F% G% H% I% J% K% L% M% N% O% P% Q% R% S% T% U% V% W% X% Y% Z% [% \% ]% ^% _% `% a% b% c% d% e% f% g% h% i% j% k% l% m% n% o% p% q% r% s% t% u% v% w% x% y% z% {% |% }% ~% % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % % --- Miscellaneous Symbols U+2600: & & & & & & & & & & & & & & & & & & & & & & & & & & & & & & & & & !& "& #& $& %& && '& (& )& *& +& ,& -& .& /& 0& 1& 2& 3& 4& 5& 6& 7& 8& 9& :& ;& <& =& >& ?& @& A& B& C& D& E& F& G& H& I& J& K& L& M& N& O& P& Q& R& S& T& U& V& W& X& Y& Z& [& \& ]& ^& _& `& a& b& c& d& e& f& g& h& i& j& k& l& m& n& o& --- Dingbats U+2700: Many of the original are defined elsewhere: ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' !' "' #' $' %' &' '' (' )' *' +' ,' -' .' /' 0' 1' 2' 3' 4' 5' 6' 7' 8' 9' :' ;' <' =' >' ?' 
@' A' B' C' D' E' F' G' H' I' J' K' M' O' P' Q' R' S' T' U' V' W' X' Y' Z' [' \' ]' ^' a' b' c' d' e' f' g' h' i' j' k' l' m' n' o' p' q' r' s' t' u' v' w' x' y' z' {' |' }' ~' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' ' --- Braille U+2800: ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( !( "( #( $( %( &( '( (( )( *( +( ,( -( .( /( 0( 1( 2( 3( 4( 5( 6( 7( 8( 9( :( ;( <( =( >( ?( @( A( B( C( D( E( F( G( H( I( J( K( L( M( N( O( P( Q( R( S( T( U( V( W( X( Y( Z( [( \( ]( ^( _( `( a( b( c( d( e( f( g( h( i( j( k( l( m( n( o( p( q( r( s( t( u( v( w( x( y( z( {( |( }( ~( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( ( Character Code (in hex) = U+2800 + 1 8 2 10 4 20 40 80 so the lower four dots is (in hex) = U+2800 + 40 + 80 + 4 + 20 => U+28E4 => ( NOTE: Almost all Truetype fonts (except "Terminus") uses circles and dots, rather than just dots as such the first 'Braille Space' glyph is not blank! Monospace for example does this. --- Full Width Characters: U+FF10 These are the same as per Chinese/Japanese glyphs, typically used with these glyphs. They are not defined in standard X window fonts (neither are asian glyphs)           ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : A B C D E F G H I J K L M N O P Q R S T U V W X Y Z        @ > ^ ?        ; = [ ] _ ` \  <  --- Miscelanious X window defined glyphs.. These are often not defined the same in other fonts                                               ! " # $ % & ' ( ) * + , - . /              ! " # $ % & ' ( ) * + , - . 
/ 0 1 2 3 4 5 6 8 9 : ; < > @ A C D F G H I J K L M N O V W X Y j k l m z { | } p r t v x z | } ~ Greek Alphabet               Specials Block U+FFF0: \uFFF9 Annotation Anchor \uFFFA Annotation Separator \uFFFB Annotation Terminator \uFFFC Replacement Object (placeholder for unspecified document) \uFFFD Replacement character (the official not-defined character) \uFFFE, \uFFFF not a character (generally something is wrong!) The most important character in this block is \uFFFD And is rendered as a filled diamond with question mark. Used to indicate a problem within the Unicode stream, such as display a windows code page as Unicode. =============================================================================== Combining Characters... Diacritical Marks, are characters that accent the previous character. Generally you have a main character then a combining character which overlays on the previous character. Some characters are pre-combined to provide direct compatibility with the older ISO8859 fonts. A + Diaeresis (u0308): A PreCombined (u00C4): Combining Characters tend to fail in unexpected ways. With marks appearing over the next character (Chrome), or not centered over/below the previous character. XTerms seem to work the best. Note that the Thai Script needs up to two combining characters over a single base character. Examples... STARG TE SG-1, a = v = r, a " b | Protect - | from end of line . " " " " " % X s% ' v " "N " - ". " . " , %,   | space removal Diacritical Mark Blocks (formatted over a space) Combining Diacritical Marks U+0300 - U+036F             a   | Protect from end of line space removal 4 5 6 7 8 | as these are all combined with a space! 1 2 - , / . N b \ | Non-combining Diacritical Marks U+02B0 - U+02FF        Punctuation Non-combining simular glyphs > # @ ? # Combining Diacritical Marks for Symbols U+20D0 - U+20FF | Protect from end of line space removal | Works in "vim" but in little else Variation Selector... 
When fonts contain both Text and Emoji variants, some symbols are in both The symbols generally have a preference for what it should be displayed as. Example... ! is a math symbol. But some web browsers will prefer to use the emoji variant! That means some mathematical formulas simply do not render as it was originally intended. "Variation Selector" is a invisible character. It indicates a rendering preference for the character before it. This is generally needed for web rendering, in terminals the indicator is not understood at this time and comes out as a unknown composing character. U+FE0E indicator for text rendering Example: ! U+FE0F indicator for emoji rendering Example: ! See http://xahlee.info/comp/text_vs_emoji.html ----------------------------------------------------- Language Examples... APL: ((Vs#V)=s#t#V)/V!,V 7#!s#!t#"""> N#U## Linguistics and dictionaries: i 1ntYnYnYl fYn[t1k Ysosie1n Y [psilTn], Yen [j[n], Yoga [jogQ] Some Chinese (double width characters) KmՋ(uvIlW[ Greek (in Polytonic): From a speech of Demosthenes in the 4th century BC: Pv Pp ww  }, f  , E  0 p q s v E x z y S { z r p y v  u w A s, p r q 0  u, e E t y Pv y  s s. Ps V    1 p  s " t Qy, v ' {, Pv t V q Q q. | s, E s    y v p Q   v w u, v q  6   q, P q s  y  s s  1x  ! 6 t }, E z q }. p p  w Qq, y v v  w uw  v C y s  v r t t @ Qs, q ! v   A  y. s,  x Georgian: From a Unicode conference invitation:     Unicode-    ,   10-12 , . , .           Unicode-,   , Unicode-   ,   , ,      . Russian: From a Unicode conference invitation: 0@538AB@8@C9B5AL A59G0A =0 5AOBCN 564C=0@>4=CN >=D5@5=F8N ?> Unicode, :>B>@0O A>AB>8BAO 10-12 <0@B0 1997 3>40 2 09=F5 2 5@<0=88. >=D5@5=F8O A>15@5B H8@>:89 :@C3 M:A?5@B>2 ?> 2>?@>A0< 3;>10;L=>3> =B5@=5B0 8 Unicode, ;>:0;870F88 8 8=B5@=0F8>=0;870F88, 2>?;>I5=8N 8 ?@8<5=5=8N Unicode 2 @07;8G=KE >?5@0F8>==KE A8AB5<0E 8 ?@>3@0<<=KE ?@8;>65=8OE, H@8DB0E, 25@AB:5 8 <=>3>O7KG=KE :><?LNB5@=KE A8AB5<0E. 
Thai (UCS Level 2): Excerpt from a poetry on The Romance of The Three Kingdoms (a Chinese classic 'San Gua'): [----------------------------|------------------------] O AH4.1H@*7H-!B#!A**1@'  #0@(-9J9I6IC+!H *4*-)1#4"LH-+I2A%1D *--LD #IBH@%2@21  2 #17-15@G5H6H I2@!7-6'4#4@G1+2 B.4K@#5"11H'+1'@!7-!2 +!2"0H2! 1H'1'*31  @+!7-1D*D%H@*7-2@+2 #1+!2H2@I2!2@%"-2*1  H2"-I--8I"8A"C+IA1 C I*2'1I@G ' 7H 'C %1%4 8"8"5%1H-@+8 H2-2@(#4+2I2#I-D+I I-##2H21##%1" $E+2C#I3 99I##%1L / (The above is a two-column text. If combining characters are handled correctly, the lines of the second column should be aligned with the '|' character above.) Ethiopian: Proverbs in the Amharic language: 0 s(5  % 05b e   ct `F b % dq A% b  `  Ed c # #u `b M s `Ed s=b % ``   psb 2p(  ( b @5 `@5e A   ` )  b - be- `3 5-b 0 dq   (dq p-b  - Hp .. 3  -b (du ce bu 5E cu   Eb %+ Msu  Ksub c * e   + b 5  )  + ) -b p  bpI p 6 cIb  - b (-5 u 0b  - `M+=  - b Runes:             (Old English, which transcribed into Latin reads 'He cwaeth that he bude thaem lande northweardum with tha Westsae.' or translated to modern english 'He said that he lived in the northern land near the Western Sea.') Braille: L(('(( <((( M((((9(0(( c(( ( M((((9( :((( ((((( (( (((( :( (9(2( y(;(( (( (( (3((( 1(((('(;( ((3(( 9(((2( y(( ((( ( (;( ( ( ( (( (%(( ((( :((( ( (((+( (9( 9(( ((;((9( (((( 9(( ((;((( 9(( %(((;((((;(( ((( 9(( !( (( ( (3(((;(2( N( (((((( ( (((+( ((2( A((( N( ((((((0(( (( (( :((( (((( %(((( 0(a(((((( ((( ((9(9((( (( !(((( (( (%(( ( (( (((( ((2( (The first couple of paragraphs of "A Christmas Carol" by Dickens) Greetings in various languages: Hello world, s y, 00000 =============================================================================== Simple Unicode Line Art... http://xahlee.info/comp/unicode_smilies.html These often make big use of Diacritical Marks and as such often fail in spectacular ways. 
(a\a) "Lenny Face" ( a \ a) For GTK fonts and web browsers (proportional fonts) ( a~ \ a) Lenny Wink (' \ ') Goggly Eyes \_(0)_/ The double-wide face fails in XTerms d2a2r2w2i2s2W e2v2o2l2u2t2i2o2s2W Darwin evolution fish %)"%(%_%) %)"% Monster (A5A`) Sad (_) Meh ( v ) Angry " a1" a Face (not in xterm, gedit, or chrome) (%/? \%) Up Face Circle of Hats (chrome works, not in xterm) - lost __4144!!! !Ll!!! !Ll!*!! !4144! !!a|2a2a2a 2%a2 2a2a22a2a 2a2%2a2a 2|!!! 14!! !Ll!!!!.___ Landscape (XTerms only, not Gedit or chrome) -(`.)-> Arrow in Heart [{-_-}] ZZZzz zz z... Sleep !%! Cat " %"  Bear  R%R R%R pedobear (look right, left) (e&_e&) Love Eyes "(0'? 0')d0 Star Eyes \(AAN` )/ Yea! () big eyes " . H" Racing Car !e "e Love You Script [2$2(22)2$2] Money (xterm only) ~(> %> )~ Bird -`- Sparklingly heart ;f%55G????P%P%d%% Rifle .I'0F'0*.C' 0F' snowflake line ------------------------------------------------------------------------------- More complex Unicode Art (small collection) http://xahlee.info/comp/unicode_ascii_art.html " %%%%%%%%%%%%%%%%%%%%%% q%+.r% $ $ $ $ $ $ %$ $ $ $ $ $ $ =&$ %%%%%%%%%%%%%%%%%%%%%% q%+.*.+.r% $ $ $ $ $ q%%r%$ $ &$ $ $ $ $ %%%%%%%%%%%%%%%%%%%%%% q%+.*.+.*.+.r% $ $ $ q%%%%%%r%$ $ $ $ $ $ %%%%%%%%%%%%%%%%%%%%%% q%+.*.+.*.+.*.+.r% $ $ $ r%%%%%%q%$ $ $ $ q%r% %%%%%%%%%%%%%%%%%%%%%% q%+.*.+.*.+.*.+.*.+.r% $ $ $ $ $ $ $ $ $ $ $ q%r%q% % % %%%%%%%%%%%%%%%%%%%%%% q%+.*.+.*.+.*.+.*.+.*.+.r% $ q%r%$ $ $ $ $ $ $ q% % %r% % % " %%%%%%%%%%%%%" %%%%%%%%%%%%%%%% $ $ $ $ $ %%%%%%%%$ $ $ $ $ $ $ %%%%%%%$ $ $ $ $ r%%%%%q%r%%%%%q%r%%%% $ $ $ q%r%r%%%%%%$ %$ $ $ $ $ $ $ %%%%%%%%$ $ $ $ %r%%%%q%%r%%%%q%%r%%% $ $ q%$ $ r%$ $ $ $ $ %%$ $ $ $ $ %%%%%n%m%%%%%$ $ $ %%r%%%q%%%r%%%q%%%r%% $ $ %%%%$ $ $ $ %%%%$ $ $ $ $ r%%r%%%%%%q%$ $ $ $ %%%r%%q%%%%r%%q%%%%r% $ $ $ %%$ $ $ $ $ r%$ $ q%$ $ $ $ $ $ $ %%q%%r%%$ $ $ $ $ %%%%r%q%%%%%r%q%%%%% $ $ $ %$ %%%%%r%r%q%$ $ $ $ $ $ $ %%r%%%%%%$ $ $ $ %%%%%%%%%%%%%%%% $ $ $ %%%%%%%%$ $ $ $ $ Q%%Q%%Q%Q%%Q%%Q%%Q%Q%%Q%%Q%Q%%Q%%Q%%Q% 
Q%%Q%%Q%Q%%Q%%Q%%Q%Q%%Q%%Q%Q%%Q%%Q%%Q% Q%Q%Q%Q%Q%Q%Q%Q%Q%Q%Q%Q%Q%Q%Q%Q%Q%Q%Q%Q%Q%Q%Q%Q% Z%i%i%i%i%i%i%i%i%i%i%i%i%i%i%i%i%i%i%i%i%i%i%]% -------------------------------------------------------------------------------
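Worked example: building characters from code points...

Several notes in this document describe characters as code-point
arithmetic: the Braille block (U+2800 plus a weight per dot), a base letter
followed by a combining mark (A + U+0308 versus pre-combined U+00C4), and
the variation selectors U+FE0E / U+FE0F. The Python sketch below shows all
three. It is only an illustration under stated assumptions: the function
names are made up for this example, the dots use the standard Braille
numbering 1-8 (left column 1,2,3,7 and right column 4,5,6,8), and U+2194 is
just one symbol that commonly has both text and emoji variants. Whether the
results display as intended still depends on the font and application.

```python
import unicodedata

# Dot weights for one Braille cell, matching the table in the Braille
# block section (values in hex):
#     1   8
#     2  10
#     4  20
#    40  80
BRAILLE_BASE = 0x2800
DOT_WEIGHTS = {
    1: 0x01, 4: 0x08,
    2: 0x02, 5: 0x10,
    3: 0x04, 6: 0x20,
    7: 0x40, 8: 0x80,
}

# Variation selectors: a rendering hint for the character before them.
TEXT_SELECTOR = "\uFE0E"
EMOJI_SELECTOR = "\uFE0F"


def braille_cell(dots):
    """Return the Braille character with the given dots (1-8) raised."""
    code = BRAILLE_BASE
    for dot in set(dots):          # ignore repeated dot numbers
        code += DOT_WEIGHTS[dot]
    return chr(code)


def precombine(text):
    """Fold base + combining mark pairs into pre-combined characters (NFC)."""
    return unicodedata.normalize("NFC", text)


def as_text(symbol):
    """Append the text-rendering variation selector to a symbol."""
    return symbol + TEXT_SELECTOR


def as_emoji(symbol):
    """Append the emoji-rendering variation selector to a symbol."""
    return symbol + EMOJI_SELECTOR


if __name__ == "__main__":
    # Lower four dots: 0x04 + 0x20 + 0x40 + 0x80 = 0xE4
    lower_four = braille_cell([3, 6, 7, 8])
    print(f"U+{ord(lower_four):04X}")      # prints U+28E4

    # 'A' followed by combining diaeresis, then the pre-combined form.
    combined = "A\u0308"
    print(len(combined), len(precombine(combined)))

    # The same arrow with a text hint and with an emoji hint.
    print(as_text("\u2194"), as_emoji("\u2194"))
```

The empty Braille cell, braille_cell([]), is the 'Braille Space' U+2800
mentioned in the Spaces list, which is why it is not blank in fonts that
draw placeholder circles for unraised dots.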