Members Bookmarks Fresher Jobs Funny Pictures MCA Projects New Member FAQ  



My Profile
Active Members
TodayLast 7 Days more...



Awards & Gifts
Online Exams

Fresher Jobs


Our fresher job section is exclusively for fresh graduates! Find jobs for freshers in major Indian cities including Bangalore, Chennai, Hyderabad, Pune or Kochi

Resources


Find educational articles, blogs, discussion threads and other resources.

Colleges


Find details about any college in India or search for courses.

Paid Surveys


website counter



Uni code


Posted Date: 04 Mar 2008    Resource Type: Articles/Knowledge Sharing    Category: General

Posted By: Aparanjitha       Member Level: Gold
Rating:     Points: 4



3.1 Unicode
Programs are written using the Unicode character set. Information about this character set and its associated character encodings may be found at:

http://www.unicode.org

The Java platform tracks the Unicode specification as it evolves. The precise version of Unicode used by a given release is specified in the documentation of the class Character.

Versions of the Java programming language prior to 1.1 used Unicode version 1.1.5. Upgrades to newer versions of the Unicode Standard occurred in JDK 1.1 (to Unicode 2.0), JDK 1.1.7 (to Unicode 2.1), J2SE 1.4 (to Unicode 3.0), and J2SE 5.0 (to Unicode 4.0).

The Unicode standard was originally designed as a fixed-width 16-bit character encoding. It has since been changed to allow for characters whose representation requires more than 16 bits. The range of legal code points is now U+0000 to U+10FFFF, using the hexadecimal U+n notation. Characters whose code points are greater than U+FFFF are called supplementary characters. To represent the complete range of characters using only 16-bit units, the Unicode standard defines an encoding called UTF-16. In this encoding, supplementary characters are represented as pairs of 16-bit code units, the first from the high-surrogates range, (U+D800 to U+DBFF), the second from the low-surrogates range (U+DC00 to U+DFFF). For characters in the range U+0000 to U+FFFF, the values of code points and UTF-16 code units are the same.

The Java programming language represents text in sequences of 16-bit code units, using the UTF-16 encoding. A few APIs, primarily in the Character class, use 32-bit integers to represent code points as individual entities. The Java platform provides methods to convert between the two representations.

This book uses the terms code point and UTF-16 code unit where the representation is relevant, and the generic term character where the representation is irrelevant to the discussion.

Except for comments (§3.7), identifiers, and the contents of character and string literals (§3.10.4, §3.10.5), all input elements (§3.5) in a program are formed only from ASCII characters (or Unicode escapes (§3.3) which result in ASCII characters). ASCII (ANSI X3.4) is the American Standard Code for Information Interchange. The first 128 characters of the Unicode character encoding are the ASCII characters.





Responses


No responses found. Be the first to respond and make money from revenue sharing program.

Feedbacks      
Popular Tags   What are tags ?   Search Tags  
(No tags found.)

Post Feedback


This is a strictly moderated forum. Only approved messages will appear in the site. Please use 'Spell Check' in Google toolbar before you submit.
You must Sign In to post a response.
Next Resource: Funny
Previous Resource: 100 ways to make you smile
Return to Discussion Resource Index
Post New Resource
Category: General


Post resources and earn money!
 
Related Resources


Contact Us    Privacy Policy    Terms Of Use   

SpiderWorks Technologies Pvt Ltd. 2006 - 2007 All Rights Reserved.