ifx_gl_tomupper,
ifx_gl_tomlower,
ifx_gl_towupper,
ifx_gl_towlower
- convert case of one character
SYNOPSIS
#include <ifxgls.h>
unsigned short ifx_gl_tomupper(gl_mchar_t *dst_mb, gl_mchar_t *src_mb, int src_mb_byte_limit);
unsigned short ifx_gl_tomlower(gl_mchar_t *dst_mb, gl_mchar_t *src_mb, int src_mb_byte_limit);
gl_wchar_t ifx_gl_towupper(gl_wchar_t src_wc);
gl_wchar_t ifx_gl_towlower(gl_wchar_t src_wc);
DESCRIPTION
These functions return the alphabetic case-equivalent of the source character, or
return the source character if it does not have a case-equivalent.
For ifx_gl_tomupper(), if src_mb has an upper-case equivalent
character, that upper-case character is copied to dst_mb; otherwise
src_mb is copied to dst_mb unchanged.
For ifx_gl_tomlower(), if src_mb has a lower-case equivalent
character, that lower-case character is copied to dst_mb; otherwise
src_mb is copied to dst_mb unchanged.
If src_mb_byte_limit is IFX_GL_NO_LIMIT then these functions
will read as many bytes as necessary from src_mb to form a complete
character; otherwise, they will not read more than src_mb_byte_limit bytes
from src_mb when trying to form a complete character.
See Multi-Byte Character Termination for more general information about src_mb_byte_limit.
For ifx_gl_towupper(), if src_wc has an upper-case equivalent
character, that upper-case character is returned; otherwise
src_wc is returned unchanged.
For ifx_gl_towlower(), if src_wc has a lower-case equivalent
character, that lower-case character is returned; otherwise
src_wc is returned unchanged.
RETURN VALUES
The functions ifx_gl_towupper() and ifx_gl_towlower() return the case-equivalent
character.
The functions ifx_gl_tomupper() and ifx_gl_tomlower() return an unsigned
short integer which encodes the number of bytes read from src_mb and
the number of bytes written to dst_mb.
To determine the number of bytes read from src_mb, pass the value returned
by these multi-byte functions to the macro IFX_GL_CASE_CONV_SRC_BYTES(x).
To determine the number of bytes written to dst_mb, pass the return value
to the macro IFX_GL_CASE_CONV_DST_BYTES(x). For example,
src_mb = src_mbs;
dst_mb = dst_mbs;
while ( *src_mb != '\0' )
{
    unsigned short retval = ifx_gl_tomupper(dst_mb, src_mb, src_mbs_bytes);
    if ( retval == 0 )
        break;  /* error: nothing was read or written; see ERRORS */
    src_mb += IFX_GL_CASE_CONV_SRC_BYTES(retval);
    dst_mb += IFX_GL_CASE_CONV_DST_BYTES(retval);
    src_mbs_bytes -= IFX_GL_CASE_CONV_SRC_BYTES(retval);
}
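A caller may also want to stop when a conversion reports failure and to bound
the loop by the number of bytes remaining. The following is a minimal,
self-contained sketch of that pattern. It does not use the GLS library:
demo_tomupper(), DEMO_PACK(), DEMO_SRC_BYTES() and DEMO_DST_BYTES() are
hypothetical stand-ins that handle single-byte ASCII only, and the packing of
the two byte counts shown here is an assumption, not the layout used by
<ifxgls.h>.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-ins: the real macros and their packing live in <ifxgls.h>. */
#define DEMO_PACK(src, dst)  ((unsigned short) (((src) << 8) | (dst)))
#define DEMO_SRC_BYTES(x)    (((x) >> 8) & 0xFF)
#define DEMO_DST_BYTES(x)    ((x) & 0xFF)

/* Stand-in for ifx_gl_tomupper(): single-byte ASCII only.
 * Returns 0 and writes nothing when byte_limit is not positive. */
static unsigned short demo_tomupper(unsigned char *dst,
                                    const unsigned char *src,
                                    int byte_limit)
{
    if (byte_limit <= 0)
        return 0;
    *dst = (unsigned char) toupper(*src);
    return DEMO_PACK(1, 1);
}

/* Upper-case at most src_bytes bytes of src into dst and null-terminate dst.
 * Returns the number of bytes written, or -1 if a conversion fails. */
static long demo_upper_string(unsigned char *dst, const unsigned char *src,
                              int src_bytes)
{
    unsigned char *out = dst;

    assert(dst != NULL && src != NULL);
    while (src_bytes > 0 && *src != '\0') {
        unsigned short retval = demo_tomupper(out, src, src_bytes);
        if (retval == 0)
            return -1;              /* stop rather than loop without advancing */
        src       += DEMO_SRC_BYTES(retval);
        out       += DEMO_DST_BYTES(retval);
        src_bytes -= DEMO_SRC_BYTES(retval);
    }
    *out = '\0';
    return (long) (out - dst);
}
```

With the real library, the same loop shape would call ifx_gl_tomupper() and the
IFX_GL_CASE_CONV_SRC_BYTES() and IFX_GL_CASE_CONV_DST_BYTES() macros in place of
the stand-ins.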
ERRORS
If an error occurs, ifx_gl_tomupper() and ifx_gl_tomlower() return 0 and write
nothing to dst_mb, and ifx_gl_towupper() and ifx_gl_towlower() return src_wc.
In either case, ifx_gl_lc_errno() then returns one of the following:
- [IFX_GL_EILSEQ]
  src_mb is not a valid multi-byte character (or src_wc is not a
  valid wide-character).
- [IFX_GL_EINVAL]
  The function cannot determine whether src_mb is a valid multi-byte
  character, because it would need to read more than src_mb_byte_limit bytes
  from src_mb. If src_mb_byte_limit is less than or equal to zero,
  ifx_gl_tomupper() and ifx_gl_tomlower() always report this error.
See Keeping Multi-Byte Strings Consistent for more information about this error.
Because these functions do not return a special value when an error occurs,
the caller must set ifx_gl_lc_errno() to zero before calling them and check
ifx_gl_lc_errno() afterward to detect an error condition. For example,
ifx_gl_lc_errno() = 0;
dst_wc = ifx_gl_towupper(src_wc);
if ( ifx_gl_lc_errno() != 0 )
{
    /* Handle error */
}
else
{
    ...
}
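The same clear-call-check sequence applies to the multi-byte functions. The
sketch below is self-contained and does not use the GLS library:
demo_lc_errno(), DEMO_EINVAL and demo_tomupper() are hypothetical stand-ins
(single-byte ASCII only), written only to show the order of the three steps.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real error state and codes come from <ifxgls.h>. */
static int demo_errno_storage;
#define demo_lc_errno()  (demo_errno_storage)   /* usable as an lvalue */
#define DEMO_EINVAL      1

/* Stand-in for ifx_gl_tomupper(): single-byte ASCII only.
 * Assumed packing: source count in the high byte, destination in the low byte. */
static unsigned short demo_tomupper(unsigned char *dst,
                                    const unsigned char *src, int byte_limit)
{
    if (byte_limit <= 0) {
        demo_lc_errno() = DEMO_EINVAL;  /* no bytes may be read */
        return 0;                       /* nothing written to dst */
    }
    *dst = (unsigned char) toupper(*src);
    return (unsigned short) ((1 << 8) | 1);
}

/* Clear the error state, convert one character, then inspect the error state.
 * Returns 1 on success and 0 on failure; *retval receives the packed counts. */
static int demo_tomupper_checked(unsigned char *dst, const unsigned char *src,
                                 int byte_limit, unsigned short *retval)
{
    assert(dst != NULL && src != NULL && retval != NULL);
    demo_lc_errno() = 0;
    *retval = demo_tomupper(dst, src, byte_limit);
    return demo_lc_errno() == 0;
}
```

Resetting the error state inside the wrapper keeps a stale error from an
earlier call from being mistaken for a failure of this one.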
MEMORY MANAGEMENT
Determining the size of the multi-byte destination buffer
The number of bytes written to dst_mb might be more or less than the
number of bytes read from src_mb. There are three ways to determine
the number of bytes that will be written to dst_mb.
The function
ifx_gl_case_conv_outbuflen(src_mb_bytes)
calculates either exactly the number of bytes that will be written to
dst_mb or a close over-approximation of the number. This function
applies to both upper-case and lower-case conversions. The
argument to
ifx_gl_case_conv_outbuflen()
is the number of bytes in the character src_mb.
The function
ifx_gl_mb_loc_max()
calculates the maximum number of bytes that will be written to dst_mb for any value
of src_mb in the current locale.
This value will always be equal
to or greater than the value returned by
ifx_gl_case_conv_outbuflen(src_mb_bytes).
The macro
IFX_GL_MB_MAX
is the maximum number of bytes that will be written to dst_mb for any value
of src_mb in any locale.
This value will always be equal
to or greater
than the value returned by
ifx_gl_mb_loc_max().
Of the three options, the macro
IFX_GL_MB_MAX
is the fastest and the only one
that can be used to initialize static buffers. The function
ifx_gl_case_conv_outbuflen(src_mb_bytes)
is the slowest, but the most precise.
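As a rough illustration of the compile-time option, the sketch below sizes a
static single-character buffer and computes a worst-case bound for a whole
string. It is self-contained and does not include <ifxgls.h>: DEMO_MB_MAX is a
hypothetical stand-in for IFX_GL_MB_MAX, and its value of 4 is an assumption
chosen only for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for IFX_GL_MB_MAX; the real value is set by <ifxgls.h>. */
#define DEMO_MB_MAX 4

/* A compile-time constant can size a static buffer for one converted character. */
static unsigned char demo_one_char_buf[DEMO_MB_MAX];

/* Worst-case destination size for a string of src_bytes bytes:
 * the string holds at most src_bytes characters, each of which converts to
 * at most DEMO_MB_MAX bytes, plus one byte for the null terminator. */
static size_t demo_worst_case_dst_bytes(size_t src_bytes)
{
    /* Guard against overflow in the multiplication below. */
    assert(src_bytes <= ((size_t) -1 - 1) / DEMO_MB_MAX);
    return src_bytes * DEMO_MB_MAX + 1;
}
```

This bound is usually far larger than needed; it trades memory for not having
to call a sizing function at run time.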
Case converting wide-characters in-place
Case conversion of wide-characters can always be done in-place. For
example, the case-equivalent of src_wc can be assigned back to src_wc,
src_wc = ifx_gl_towupper(src_wc);
Case converting multi-byte characters in-place
Case conversion of multi-byte characters cannot always be done in-place.
If the value returned by
ifx_gl_case_conv_outbuflen(src_mb_bytes)
is not equal to src_mb_bytes, then case conversion cannot be done in-place;
a separate destination buffer must be allocated.
However, if the value returned by
ifx_gl_case_conv_outbuflen(src_mb_bytes)
is exactly equal to src_mb_bytes, then case conversion can be done in-place.
For example,
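src_mb_bytes = ifx_gl_mblen(src_mb, ...);
dst_mb_bytes = ifx_gl_case_conv_outbuflen(src_mb_bytes);
if ( dst_mb_bytes == src_mb_bytes )
{
    /* Same size: the result can overwrite the source character */
    retval = ifx_gl_tomupper(src_mb, src_mb, src_mb_bytes);
}
else
{
    /* Different size: convert into a separate buffer */
    dst_mb = (gl_mchar_t *) malloc(dst_mb_bytes);
    retval = ifx_gl_tomupper(dst_mb, src_mb, src_mb_bytes);
}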
Case converting multi-byte strings
All of the above discussion of memory management when case converting a single
multi-byte character also applies to converting a string of one or more multi-byte characters. For example,
/* Assume src_mbs is null-terminated */
src_mbs_bytes = strlen(src_mbs);
dst_mbs_bytes = ifx_gl_case_conv_outbuflen(src_mbs_bytes);
if ( dst_mbs_bytes == src_mbs_bytes )
{
    /* Same size: convert each character in place */
    src_mb = src_mbs;
    while ( *src_mb != '\0' )
    {
        retval = ifx_gl_tomupper(src_mb, src_mb, IFX_GL_NO_LIMIT);
        src_mb += IFX_GL_CASE_CONV_SRC_BYTES(retval);
    }
}
else
{
    /* Different size: convert into a separate buffer (+1 for the terminator) */
    dst_mbs = (gl_mchar_t *) malloc(dst_mbs_bytes + 1);
    src_mb = src_mbs;
    dst_mb = dst_mbs;
    while ( *src_mb != '\0' )
    {
        retval = ifx_gl_tomupper(dst_mb, src_mb, IFX_GL_NO_LIMIT);
        src_mb += IFX_GL_CASE_CONV_SRC_BYTES(retval);
        dst_mb += IFX_GL_CASE_CONV_DST_BYTES(retval);
    }
    *dst_mb = '\0';
}
PERFORMANCE
Since these functions assign the destination character regardless of
whether the source character has a case-equivalent character,
dst_wc = ifx_gl_towlower(src_wc);
is the same as the conventional,
if ( ifx_gl_iswupper(src_wc) )
    dst_wc = ifx_gl_towlower(src_wc);
else
    dst_wc = src_wc;
but the first is usually faster.
SEE ALSO
ifx_gl_ismupper()
ifx_gl_ismlower()
ifx_gl_iswupper()
ifx_gl_iswlower()
ifx_gl_mb_loc_max()
ifx_gl_case_conv_outbuflen()
ACKNOWLEDGEMENT
Portions of this description were derived from the X/Open CAE
Specification: "System Interfaces and Headers, Issue 4"; X/Open
Document Number: C202; ISBN: 1-872630-47-2; Published by X/Open Company
Ltd., U.K.