So this is just a little test I whipped up last night because Tatsujin wants a Japanese port of Monolith, and it'd be much easier to use the system card font than to try to do bitmap fonts for everything. HuC doesn't make this particularly easy, but since it allows inline assembly, you can still do it without much trouble.

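    lda #$82                ; the high byte of the sjis code
    sta <_ah
    lda #$60                ; the low byte of the sjis code
    sta <_al
    lda #low(_bytebuffer)   ; low byte of the destination buffer address
    sta <_bl
    lda #high(_bytebuffer)  ; high byte of the destination buffer address
    sta <_bh
    lda #$00                ; set to $00 for 16x16 or $01 for 12x12
    sta <_dh
    jsr ex_getfnt           ; system card call that fills the buffer
    sta _returnval          ; keep the result code for the C side

bytebuffer is: char bytebuffer[32];
returnval is: char returnval;
Both are globals for speed. From the assembly side they are referenced with a leading underscore (_bytebuffer, _returnval).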
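The two bytes loaded into _ah and _al above are just the halves of one 16-bit Shift-JIS code, and ex_getfnt needs a proper one. Here's a rough sketch in plain C of splitting a code and sanity-checking it first. The function names are made up for illustration (they are not part of HuC or the system card), the ranges are the standard double-byte Shift-JIS ranges, and this is written as ordinary C rather than checked against HuC itself. The system card may still reject codes inside these ranges.

```c
/* Hypothetical helpers, not part of HuC or the system card BIOS. */

/* High byte of a 16-bit Shift-JIS code (the value that goes in _ah). */
unsigned char sjis_high(unsigned int code)
{
    return (unsigned char)((code >> 8) & 0xFF);
}

/* Low byte of a 16-bit Shift-JIS code (the value that goes in _al). */
unsigned char sjis_low(unsigned int code)
{
    return (unsigned char)(code & 0xFF);
}

/* Returns 1 if the code falls in the standard double-byte Shift-JIS
   ranges: lead byte $81-$9F or $E0-$EF, trail byte $40-$7E or $80-$FC. */
int sjis_is_double_byte(unsigned int code)
{
    unsigned char hi = sjis_high(code);
    unsigned char lo = sjis_low(code);
    int lead_ok = (hi >= 0x81 && hi <= 0x9F) || (hi >= 0xE0 && hi <= 0xEF);
    int trail_ok = (lo >= 0x40 && lo <= 0xFC) && lo != 0x7F;
    return lead_ok && trail_ok;
}
```

For the code used above, $8260, sjis_high gives $82 and sjis_low gives $60.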
You can easily turn this code into a function by loading _ah, _al, and _dh from variables instead of the hard-coded values I've used here. bytebuffer will contain the data of the one-bit-per-pixel character stored in the system card BIOS in either 16x16 or 12x12 size, depending on how you set _dh.

If returnval is 1, then the call failed. You need to give it a proper Shift-JIS code to work.

Decoding the buffer is easy too; it's a simple left-to-right pixel arrangement, with the highest bit containing the first pixel (decimal 128), the second highest bit containing the next pixel (decimal 64), etc. This goes on for 16 pixels (two bytes), then moves on to the next line. So, bytebuffer[0] and bytebuffer[1] contain the pixels for the first line, bytebuffer[2] and bytebuffer[3] contain the pixels for the second line, and so on until done.
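To make the buffer layout concrete, here's a minimal sketch in plain C of reading pixels back out of a 16x16 glyph. The function names are made up for illustration (nothing here comes from HuC or the system card), and it only assumes the layout described above: two bytes per line, highest bit first.

```c
/* Hypothetical helpers for a 16x16 glyph, not part of HuC or the BIOS. */

/* Returns 1 if the pixel at column x, row y (both 0..15) is set. */
int glyph_pixel(const unsigned char *glyph, int x, int y)
{
    /* Each row is 2 bytes: columns 0..7 in the first, 8..15 in the second. */
    unsigned char b = glyph[y * 2 + (x >> 3)];
    /* Bit 7 (decimal 128) is the leftmost pixel of each byte. */
    return (b & (0x80 >> (x & 7))) != 0;
}

/* Writes row y as 16 characters ('#' = set, '.' = clear) plus a terminator.
   out must have room for 17 chars. */
void glyph_row_text(const unsigned char *glyph, int y, char *out)
{
    int x;
    for (x = 0; x < 16; x++)
        out[x] = glyph_pixel(glyph, x, y) ? '#' : '.';
    out[16] = '\0';
}
```

Calling glyph_row_text(bytebuffer, y, row) for y from 0 to 15 gives a quick text dump of the character, which is handy for checking that the call returned what you expected. This only covers the 16x16 case.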
