PCEngineFans.com - The PC Engine and TurboGrafx-16 Community Forum 
		Tech and Homebrew => Turbo/PCE Game/Tool Development => Topic started by: touko on August 31, 2009, 09:10:48 PM
		
			
			- 
				Hi all, the question is in the title!!  :roll:
 
 I am selecting the control register:
 poke(0x0000,0x05);
 
 I am writing 0xfd to it:
 poke(0x0002,0xfd);
 
 Now if I read the content of $0000:
 put_hex(peekw(0x0000),4,4,9);
 
The result is always 4000; the collision between sprite #0 and the other sprites is never reported ..
 
 Thanks for the help ..
- 
				
 Unless you have a really good reason to use sprite #0 hit detection, I would highly recommend not to use it. That aside, don't poll/read port $0000 directly during active display. You can cause the system to miss a VDC interrupt (vblank or hblank). There's a variable for status register of the VDC that you should check/read from. This is updated on every VDC interrupt. Though... HuC might erase the sprite collision flag if another interrupt happens after it, but before you have a chance to read from it. I'll have to look to see what HuC is doing for interrupt code (or you could check in mednafen debugger too).
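 The idea of reading a status copy that the interrupt updates, instead of polling the port, could look roughly like the sketch below. It is plain C, not HuC library code: the names vdc_status_shadow, vdc_irq_latch and collision_pending are hypothetical, and the sketch assumes that bit 0 of the VDC status byte is the sprite #0 collision flag. OR-ing the status into the copy is one way to keep the flag from being wiped by a later interrupt before the game code has looked at it.

```c
#include <stdint.h>

/* Assumption: bit 0 of the VDC status byte is the sprite #0 collision flag. */
#define VDC_STATUS_COLLISION 0x01u

/* Hypothetical shadow copy; on real hardware the interrupt handler would fill it. */
static volatile uint8_t vdc_status_shadow = 0;

/* Called from the (simulated) VDC interrupt with the value read from the status port.
   OR-ing keeps a collision flag alive even if a later interrupt reports none. */
void vdc_irq_latch(uint8_t status_from_port)
{
    vdc_status_shadow |= status_from_port;
}

/* Game code: returns 1 if a collision was latched since the last call, then clears it. */
int collision_pending(void)
{
    int hit = (vdc_status_shadow & VDC_STATUS_COLLISION) != 0;
    vdc_status_shadow &= (uint8_t)~VDC_STATUS_COLLISION;
    return hit;
}
```

 The game loop would then call collision_pending() once per frame and never touch port $0000 itself.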
- 
				ok, thanks tom ..
 
 Now I understand why I can't read it  :-k ..  :^o
 My main reason for using hardware collision is testing at first, and later to use it to test collisions between the player ship and the alien/bullet sprites.
 
 As for debugging, I'm not familiar with mednafen's debugger  :oops:
 
 Thanks.
- 
				huc most likely jacks the interrupts up and out from under you.
 
 as for mednafen debugger: alt-d to enter debug mode
 
 http://mednafen.sourceforge.net/documentation/debugger.html
 
 and click that to know how to use it!
 
 
 i wouldn't even worry about sprite 0 collision detection.  It's a neat thought, but you can set up and track bounding boxes and be just as well off.
 
 
- 
				Thanks arkhan  :wink:..
 
 Without sprite #0 collision, you must test all the time whether all sprites, or all possible sprites, collide with your ship, and some CPU time is lost.
 
 Why not start checking player ship collisions only when a sprite #0 collision interrupt has occurred???
- 
				I guess I could see that being more beneficial if enemies can't collide, but you can do some tricks with non sprite 0 collision checks that don't kill the CPU much at all.
 
 I at most check 22 sprites all against each other.  It bogs down at some point sometimes, but that's checking EVERY sprite against EVERY sprite..... :)
 
 you could do something like: check the two sprites, and if one isn't the player, just return.  Don't do any stuff.  That way when you loop through to, say, an enemy vs an enemy, it just returns.
 
 i hope that even makes sense.  I'm awful at explaining things.
- 
				It also makes a difference if you're using arrays to hold the data. That will cut into CPU time a LOT. I think it was Tom that discovered that array access is like...138 cycles or so each. It's really expensive in any event. There are other ways to do it that are less expensive...
			
- 
 yes, but to test whether any enemy collides with the player sprite, you must test the x/y coordinates and bounding boxes (or another technique) for all possible sprites, every time ..
 
 For Galaxian, software collision detection is relatively simple, because the enemies are positioned in lines, but for a shmup, for example, this is not optimal.
 
 And like tom said, array access is very slow in HuC.
- 
				For a shmup, the #0 collision could be useful if your ship is sprite #0. That would eliminate a lot of other checks. I wonder how many games, if any, do this?
			
- 
				
 The problem with sprite #0 collision: 1) you don't know which other sprite(s) collided with this sprite, so you still have to check which sprite collided; 2) it gives pixel-perfect collision, so if you use it for your ship, you're going to piss people off. Collision detection like that is super-super rare and mostly makes gameplay frustrating and/or impossible.
 
 Boundary boxes are the way to go. You only need a handful of compares per object-to-object check. In ASM, that's pretty light work. Object to map is a little more taxing, depending what kind of collision shapes you use on the map.
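 To make the "handful of compares" concrete, here is a minimal bounding-box sketch in standard C. It is not HuC code (it uses a struct, which HuC lacks), and the names Box, boxes_overlap and first_hit are made up for illustration. Each object-to-object check is four comparisons, and only player-vs-enemy pairs are tested.

```c
#include <stdint.h>

/* Hypothetical box type: top-left corner plus width and height in pixels. */
typedef struct {
    int16_t x, y;
    uint8_t w, h;
} Box;

/* Two axis-aligned boxes overlap when they overlap on both axes: four compares. */
int boxes_overlap(const Box *a, const Box *b)
{
    return a->x < b->x + b->w && b->x < a->x + a->w &&
           a->y < b->y + b->h && b->y < a->y + a->h;
}

/* Check the player against each enemy only; returns the index of the
   first enemy hit, or -1 if nothing overlaps. */
int first_hit(const Box *player, const Box *enemies, int count)
{
    for (int i = 0; i < count; i++) {
        if (boxes_overlap(player, &enemies[i])) {
            return i;
        }
    }
    return -1;
}
```

 Making the box a little smaller than the sprite graphic is the usual way to get the forgiving hit area that pixel-perfect detection can't give you.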
 
 Yeah, array (pointer) access in HuC is broken for static mapped ram access. Far access is still overly slow for no real apparent reason, but that has nothing to do with static mapped ram. There might be a way to speed it up some, but it would require a function call for every access.
- 
				1) you don't know what other sprite(s) collided with the this sprite so you still have to check which sprite collided
 
 
 Yes, but the difference in this case is that the check is done only when a sprite #0 collision has occurred; then you go and check which sprite(s) collided with sprite #0, and not all the time ..
 
 2) it gives pixel perfect collision; so if you use it as you ship, you're going to piss people off. Collision detection like that is super-super rare and mostly makes gameplay frustrating and/or impossible. 
 You can complement that with bounding boxes. 8)
- 
				 Yeah, array (pointer) access in HuC is broken for static mapped ram access. Far access is still overly slow for no real apparent reason, but that's nothing to do with static mapped ram. There might be away to speed it a some, but it would require a function call for every access.
 
 
 function overhead sure wont be speeding it up a great deal, rofl.
 
 most things in HuC are slow.  Even things that shouldn't be.
 
 slow arrays + no structs = ._.
 
 
 everything speed wise has to be done in asm.  There is no way to do it in huc.
 
 
 well, no easy/straightforward/non-headache way.
- 
				1) you don't know what other sprite(s) collided with the this sprite so you still have to check which sprite collided
 
 
 Yes but the difference in this case is the check should be only when sprite #0 collison was occur, you go to check which sprite(s) collide with sprite #0 and no all the time ..
 
 In a general game engine you do, because you'll have sprites that aren't interactive and that might appear over sprite #0: smoke, explosions, your character's weapons, dead enemy animations, etc. And on the PCE, sprites are often used for complex background clipping of other sprites, to pop pieces of the background out over the sprites.
 
 2) it gives pixel perfect collision; so if you use it as you ship, you're going to piss people off. Collision detection like that is super-super rare and mostly makes gameplay frustrating and/or impossible. 
 You can complete that with boundary boxes . 8)
 
 
 Then it becomes nothing more than a switch in the current frame to do collision detection or not. But how often that case occurs is still variable. You can't optimize for it because of its random nature. I mean, how can you take advantage of that? Not to mention all the false positives you're going to get, which will keep that occurrence rate higher. For any game logic frame, you should optimize for the worst case scenario anyway; otherwise you're going to get uneven resource usage from frame to frame, resulting in slowdown, missed frames, or whatever.
 
 Trust me, if there was a real world application for sprite #0 collision, games would be using it. We long-time PCE coders would be using it. I know of no games that use it, for the specific reasons I stated. The PCE isn't the only system with this. The Megadrive has it too, and it isn't used there for the same reasons.
 
 The real application of it is for pixel accurate collision, which is cpu intensive. But like I said, not knowing which sprite it collides with makes it useless - unless all you have on screen is two sprites. Then you'll know which sprite collided with it ;)
 
 
 function overhead sure wont be speeding it up a great deal, rofl. 
 Pragma Fastcall function in HuC is fast. You tell it how to pass arguments. Be it one of the three registers A/X/Y or ZP pseudo regs. You can also do argument overloading - a single function can take a variable number of arguments. You tell it the size of the arguments too; int or char. No internal C stack crap. I could setup a small pointer optimization block. Maybe 32 or so pointers. 16 near pointers and 16 far pointers. You call a function to load a pointer with the address. And special functions to read/write from those pointers. At min, it would definitely be 60% faster. You could even optimize for small arrays or split arrays like 16bit, 24bit, 32bit etc.
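 The pointer block idea (load a pointer into a slot once, then read/write through special functions) could be sketched like this in standard C. This is only an illustration of the interface, not the fastcall/asm implementation described above; PTR_SLOTS, ptr_load, ptr_read and ptr_write are hypothetical names, and on the real machine each slot would be a ZP address vector.

```c
#include <stdint.h>

/* Hypothetical slot count; the post suggests something like 16 near pointers. */
#define PTR_SLOTS 16

/* One base address per slot. On the HuC6280 these would live in zero page. */
static uint8_t *ptr_slot[PTR_SLOTS];

/* Load a slot once; it keeps its value until it is loaded again. */
void ptr_load(int slot, uint8_t *addr)
{
    ptr_slot[slot] = addr;
}

/* Read one byte at base + index (index limited to 0..255, like the Y register). */
uint8_t ptr_read(int slot, uint8_t index)
{
    return ptr_slot[slot][index];
}

/* Write one byte at base + index. */
void ptr_write(int slot, uint8_t index, uint8_t value)
{
    ptr_slot[slot][index] = value;
}
```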
 
 I should really do this. And while I'm at it, do a faster left/right shift function :D
 
 
 
 
 
- 
				oh god dude dont get me started on right and left shift.
 
 Here I thought I was being optimal.  I found out i was being craptimal.
 
 gahhhhh
 
 ;)
 
 
 maybe PCC will absorb some of HuC and make a new C compiler soon.  Have to see if we even feel like actually doing it as opposed to games.  Writing compilers is totally not amusing.
 
 Not with the intent to put HuC down or say it's a POS.  Merely to say, we're going to use it, and improve it!
- 
				Writing compilers is boring...and I know this because I have one of my own. :(
			
- 
 yea it started as a school project.   Then school ended and so did the motivation, lol.
- 
				yea it started as a school project.   Then school ended and so did the motivation, lol. 
 Nada on both. It's an actively developed project:
 http://www.bsdbasic.com/
- 
 I meant mine, not yours. :)
- 
				I meant mine, not yours. :) Ok, I didn't know you had one too...you didn't say that. :P hehe
- 
				oh I did i just didnt specify.
 
 PCC:
 
 PC Engine C Compiler
 
 I have most of the grammar/C instruction recognition in place and it generates intermediate 6502...
 
 but it needs fine tuning.
- 
				Hi ..
 
 I have a doubt: is slow array access in HuC only for far data??
 Or is near data just as slow??
 
 
- 
 Far data access is sloooow in HuC. And all near data accessed via array is treated as far data. So yes to both ;)
 
 Regular "global" variables are fine/fast as long as they aren't arrays.
- 
 Thanks ...
 
 I am testing array access with ASM ..
 For a new project I use a custom fade_in function based on aramis's function, and there is a slight slowdown when I use it (because of some array accesses) with multiple parallax layers ..
 
 I'll try to do the array reads/writes in ASM in this function first ..
 
 The drama is  " I LIKE THAT "  :shock:
- 
				How do I use an array pointer in ASM???
 
 In C:
 
 int *tab;
 
 tab = pic_pal; (tab takes the address of the pic_pal array, which contains the background colors, for example) ..
 
 My code is
 #asm
 ldx _x_val
 lda _tmp_pal,x
 sta _b2
 inx
 lda _tmp_pal,x
 sta _b2+1
 inx
 stx _x_val
 #endasm
 
tmp_pal is my array pointer, and b2 is a global int ..
 And I want to load each value from tmp_pal into b2.
 
 In C an array pointer works like an array, but in ASM ...
 
 PS: loading from an array into a global int works fine.
 
- 
				A few rules;
 
 Try to keep ALL variables as global in HuC. They are much faster and they are passed in function calls MUCH faster. Global Vars ftw :D
 Split your INT arrays into high/low (MSB/LSB) segments.
 Try to use direct addressing with indexing over traditional pointers of C, when possible.
 
 Ok, some examples:
 
 BYTE size array. BYTE length array.
  char array1[256];
 .
 .
 .
 array1[x] = 3;
 
is.. ldx _x
 lda #$03
 sta _array1,x
 
 
 
 WORD size array. BYTE length array.
  int array1[256];
 .
 .
 .
 array1[x] = 3;
 
is.. lda _x
 asl a
 tax
 bcs .upper
 .lower
 lda #$03
 sta _array1,x
 lda #$00
 sta _array1+$1,x
 bra .skip
 .upper
 lda #$03
 sta _array1+$100,x
 lda #$00
 sta _array1+$101,x
 .skip
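 The asl/bcs trick above can be checked with a small C model. This is just a sketch of the arithmetic (word_offset is a made-up name): doubling an 8-bit index overflows for x >= 128, and the carry that falls out of the shift selects the "+$100" half of the array.

```c
#include <stdint.h>

/* Models: lda _x / asl a / tax / bcs .upper
   Returns the byte offset into the WORD array that the asm ends up using. */
uint16_t word_offset(uint8_t x)
{
    uint8_t low = (uint8_t)(x << 1);  /* asl a: only 8 bits survive in X    */
    uint8_t carry = (uint8_t)(x >> 7); /* the bit shifted out lands in carry */

    /* .upper path adds $100 to the base address, .lower path does not. */
    return carry ? (uint16_t)(0x100u + low) : (uint16_t)low;
}
```

 For every index from 0 to 255 this gives the same result as a plain 16-bit 2*x, which is why the two-branch version is safe.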
 
 
 
 WORD size array. BYTE length array.
  int array1[256];
 .
 .
 .
 array1[x] = var;
 
is.. lda _x
 asl a
 tax
 bcs .upper
 .lower
 lda low(_var)
 sta _array1,x
 lda high(_var)
 sta _array1+$1,x
 bra .skip
 .upper
 lda low(_var)
 sta _array1+$100,x
 lda high(_var)
 sta _array1+$101,x
 .skip
 
 
 
 WORD size array. WORD length array.
  int array1[2048];
 .
 .
 .
 array1[x] = var;
 
is.. lda low(_x)
 asl a
 sta <_ZP
 lda high(_x)
 rol a
 sta <_ZP+1
 lda #low(_array1)
 adc <_ZP           ;<- no need for carry because previous shift should clear it already. Unless you have an illegal length (because >$7fff WORDs is greater than 64kbytes)
 sta <_ZP
 lda #high(_array1)
 adc <_ZP+1
 sta <_ZP+1
 
 lda low(_var)
 sta [_ZP]
 lda high(_var)
 ldy #$01
 sta [_ZP],y
 
 
 
 
 BYTE size array. WORD length array.
  char array1[2048];
 .
 .
 .
 array1[x] = var;
 
is.. lda low(_x)
 clc
 adc #low(_array1)
 sta <_ZP
 lda high(_x)           ;<- the index is a WORD, so its high byte has to be added too
 adc #high(_array1)
 sta <_ZP+1
 
 lda _var
 sta [_ZP]
 
 
 All examples assume the global array is defined as near, which it should be since it's in work ram. Reading from an array is a little bit different since the data could be far. You'd better know beforehand.
 
 Here are some optimizations examples:
 
 WORD size array (split array). BYTE length array.
  char array1_lo[256];
 char array1_hi[256];
 .
 .
 .
 array1_lo[x] = var;
 array1_hi[x] = var>>8;
 
is.. ldx _x
 lda low(_var)
 sta _array1_lo,x
 lda high(_var)
 sta _array1_hi,x
 
 You can see the above example does the same thing as "int array1[ x ]=var", but with less code. Also, in the later examples, don't forget that if you're accessing the arrays sequentially, you can optimize for that.
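 For reference, the split-array layout written out as standard C. This is a sketch only; split_store and split_load are hypothetical helper names. The point is that a single 8-bit index reaches both halves of a 16-bit value without any multiply-by-two.

```c
#include <stdint.h>

/* 256 16-bit values stored as two separate byte arrays (LSBs and MSBs). */
static uint8_t array1_lo[256];
static uint8_t array1_hi[256];

/* Store: the same index is used for both halves. */
void split_store(uint8_t x, uint16_t var)
{
    array1_lo[x] = (uint8_t)(var & 0xFF);
    array1_hi[x] = (uint8_t)(var >> 8);
}

/* Load: recombine the two bytes into one 16-bit value. */
uint16_t split_load(uint8_t x)
{
    return (uint16_t)(array1_lo[x] | (array1_hi[x] << 8));
}
```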
 
 
 WORD size array. WORD length array.
  int array1[2048];
 .
 .
 .
 array1[x++] = var;
 array1[x++] = var2;
 array1[x++] = var3;
 
is.. lda low(_x)
 asl a
 sta <_ZP
 lda high(_x)
 rol a
 sta <_ZP+1
 lda #low(_array1)
 adc <_ZP
 sta <_ZP
 lda #high(_array1)
 adc <_ZP+1
 sta <_ZP+1
 
 lda low(_var)
 sta [_ZP]
 lda high(_var)
 ldy #$01
 sta [_ZP],y
 
 iny
 lda low(_var2)
 sta [_ZP],y
 lda high(_var2)
 iny
 sta [_ZP],y
 
 iny
 lda low(_var3)
 sta [_ZP],y
 lda high(_var3)
 iny
 sta [_ZP],y
 
 I'll post some more examples later.
 
- 
				Ok... back.
 
 So declaring an array is the same thing as a "static" pointer in C. Or a constant pointer. Because that pointer never changes.
 
 This means you can access small arrays in work ram with direct addressing plus indexing (as shown in the BYTE/BYTE example). I'm not entirely sure, but I think HuC, being Small C, doesn't allow the creation of pointers like you can in normal C. But that doesn't mean you can't do it in ASM.
 
  char *pointer;
 .
 .
 .
 pointer = &label1;
 
 equates to...
 
  lda #low(_label1)
 sta <_pointer
 lda #high(_label1)
 sta <_pointer+1
 
 So now 'pointer' holds the address of 'label1'
 
  *pointer=var;
 pointer++;
 
  lda _var
 sta [_pointer]
 inc <_pointer
 bne .skip
 inc <_pointer+1
 .skip
 
 There are 128 possible 'address' vectors or pointer slots on the 65x. In HuC and/or the system card, some of those are reserved so you actually have less than that. But in reality, you don't usually need more than 20-30 pointers/address vectors. The cool thing about the 65x's hardware ZeroPage registers, is that since there are soo many address vectors definable, it's not a problem just to leave the vector setup. I.e. you don't have to keep loading and unloading pointers when you need to use them.
 
 The 65x allows a few other things. First thing to note: all indexing on the HuC6280 is free. And you can use both pre-indexed and post-indexed indirect addressing. Here are some examples:
 
  pointer[x++]=var;
 
  ldy _x
 lda _var
 sta [_pointer],y
 inc _x
 
 In that example, the pointer isn't destroyed or altered. This means you can randomly and quickly access up to 256 elements from the base pointer address. Very fast and flexible. The only downside is the limited size of the indexing to 256, but there are clever and fast ways to extend this further.
 
 Another example of pointer flexibility:
 
  *(pointer_array[x])=var;
 
  lda _x
 asl a                  ;<- each pointer in the table is 2 bytes
 tax
 lda _var
 sta [_pointer_array,x]
 
 So you can index a pointer table array. I hope my C syntax (C99) is correct on those, but if not - you should be able to get the idea I'm trying to convey.
 
 Now, this has all been for accessing near data. Accessing far data is exactly the same thing, except for one minor difference. Far data needs a 24bit address. Unfortunately, Hudson didn't add a long addressing mode to the custom R65C02S. It's not too big of a problem, but it does require prioritizing and optimizing if you need specific access speeds. Basically, it comes down to how you map out your logical cpu address range.
 
 Anyway, for far data the process I've described above is exactly the same. It just requires you to map a far page/block of memory into the local address range. Once this is done, all that I've written applies exactly. Work ram you never want to map out, except in extreme conditions, mostly because the stack and address vectors get replaced with whatever you map in at that address range/page. You'll definitely have a problem with interrupts and such. So leave that page alone. The last page and the very first page also normally shouldn't be changed. If you design your code from the ground up, it's not much of a problem. But if you plan to use HuC or any of the Mkit libs or setup, you're mostly restricted to leaving that 24k fixed in the local address range.
 
 Not sure if I'm forgetting anything :P
 
 
 Edit: Oh yeah. Some closing information/suggestions.
 
 Most array/pointer data is accessed in some sort of sequential method. Take advantage of this. You can use the free indexing even if the source array is longer than 256 bytes/128 words. Just use the indexing reg as a counter, then advance the pointer itself. Like such:
 
  for(int x=0;x<384;x++)
 {
 *pointer &= 0x03;
 pointer++;
 }
 
  clx
 .loop_outer
 cly
 .loop_inner
 lda [_pointer],y
 and #$03
 sta [_pointer],y
 iny
 cpy #128
 bcc .loop_inner
 inx
 cpx #03
 beq .out
 tya
 clc
 adc <_pointer
 sta <_pointer
 bcc .loop_outer      ;<- only bump the MSB when the add carried
 inc <_pointer+1
 bra .loop_outer
 .out
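 The structure of that loop, written back out in C as a sketch (mask_384 is a made-up name): Y counts 0..127 inside a chunk, X counts the three chunks, and the base pointer moves forward by 128 between chunks.

```c
#include <stdint.h>

/* Processes 384 bytes as 3 chunks of 128, mirroring the X/Y register roles:
   x = chunk counter, y = index within the chunk. */
void mask_384(uint8_t *pointer)
{
    for (uint8_t x = 0; x < 3; x++) {
        for (uint8_t y = 0; y < 128; y++) {
            pointer[y] &= 0x03;   /* lda [_pointer],y / and #$03 / sta [_pointer],y */
        }
        pointer += 128;           /* add 128 to the pointer, carrying into the MSB */
    }
}
```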
 
 384 is an easy multiple of 128. So here's an example of a variable length loop, but still using indexing for addressing and a counter:
 
  /* let's assume len is 521 */
 for(int x=0;x<len;x++)
 {
 *pointer &= 0x03;
 pointer++;
 }
 
  clx
 cly
 .loop_inner
 cpy low(_len)        ;5
 beq .check_msb       ;2
 .cont
 lda [_pointer],y     ;7
 and #$03             ;2
 sta [_pointer],y     ;7
 iny                  ;2
 bne .loop_inner      ;4
 inc <_pointer+1      ;<- Y wrapped: move on to the next 256 byte block
 inx
 bra .loop_inner
 .check_msb
 cpx high(_len)       ;<- done only when both the LSB and the MSB of the count match
 bne .cont
 .out
 
 
 Compare that with this normal, non index method:
 
 .loop
 lda [_pointer]       ;7
 and #$03             ;2
 sta [_pointer]       ;7
 
 lda <_pointer        ;4
 clc                  ;2
 adc #$01             ;2
 sta <_pointer        ;4
 lda <_pointer+1      ;4
 adc #$00             ;4
 sta <_pointer+1      ;4
 
 cmp high(_len)       ;5
 bcc .loop            ;4
 
 lda <_pointer        ;4
 cmp low(_len)        ;5
 bne .loop            ;4
 .out
 
 The indexed version of the loop is 29 cycles per iteration of the for loop. You have a tiny amount of overhead on X rollover, which happens once every 256 increments of Y. That translates into less than a single cycle added to the per-iteration count.
 
 On the other hand, the normal method is 49 cycles per iteration of the for loop. And when the MSB matches, you add another 13 cycles on top of that 49. So if 'len' was $1ff, the last $ff iterations would be 62 cycles each, and the average would be 55.5 cycles.
 
 55cycles VS 29cycles. The index method is more abstract, but clearly the winner. 1.9 times as fast. If you did the same thing in HuC, it'd take around 200 or more cycles per for/loop instance.
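 The arithmetic behind those figures, as a tiny C sketch (the function names are made up). It assumes, as the post does for len = $1ff, that roughly half of the iterations cost 49 cycles and the other half 62.

```c
/* Average cost when about half the iterations take 'first' cycles
   and the other half take 'second' cycles. */
double average_cycles(double first, double second)
{
    return (first + second) / 2.0;
}

/* How many times faster the cheaper loop is. */
double speedup(double slow_cycles, double fast_cycles)
{
    return slow_cycles / fast_cycles;
}
```

 average_cycles(49, 62) gives the 55.5 figure, and speedup(55.5, 29) comes out at a little over 1.9, matching the "1.9 times as fast" claim.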
 
- 
				wahouuu, great  =P~
 Thank you very,very much ..
 
 I'll need many centuries to understand it all  :mrgreen:
 
 Your examples are useful for me, to understand how arrays work in asm.
 
 My first example works fine for loading a 16-bit array into an int  :wink: (and it's simple) ..
 
 Not sure if I'm forgetting anything :P
 
 
 Yes, in fact only my question  :-"
 :wink:
 
 At first I'm using only inline ASM (I'm not completely mad  :mrgreen:)
 I have declared an int pointer in HuC.
 I want to use multiple int arrays in the same function, which contains ASM.
 
 This pointer takes the address of an int array, for example:
 
 #incpal(logo_pic_pal,"pcx/tg16.pcx") (array containing all the bcgnd colors)
 
 int *datas_pointer;
 int var;
 .
 .
 .
 .
 datas_pointer = logo_pic_pal;
 
 I want to do this in ASM:
 var = *(datas_pointer++);
 .
 .
 .
 .
 var = *(datas_pointer++);
 
 #asm
 ; I want to read the values behind datas_pointer here  8)
 #endasm
 
 
 I initialize datas_pointer with logo_pic_pal's address in C beforehand.
 
 And I want to read the data in asm through the pointer datas_pointer!!
 
 If I declare datas_pointer as an int array and copy all of logo_pic_pal into it, I can read all the values perfectly.
 
  sta [_pointer],y
 
The same goes for a load  :-k
 
 Ah, if I do this to read the content at the start address held in datas_pointer:
   lda [_datas_pointer] (for example)
 
 the compiler says "Incorrect zero page address"
 
 Hmm, [_datas_pointer] or <_datas_pointer doesn't seem to work in HuC  #-o
- 
				Compiler says " Incorrect zero page address"
 
 Huum [_datas_pointer] or <_datas_pointer ,seems not working on HUC
 
 Correct. HuC is not optimized to keep pointers in Zeropage. The compiler is simple, and unfortunately treats the 65x like any other load/store processors. That makes the design simple, but the speed slow.
 
 This means you have to manually copy the pointer into a ZP address vector. And the catch is.... HuC itself doesn't allow you to define anything in ZP, be it a fast ram variable or an address vector/pointer. So, you must define a ZP variable in ASM: either in startup.asm, or somewhere in the asm library, or using the #asm/#endasm directives near the beginning of the main C file.
 
 #asm
 .zp
 __pointer1: .ds 2
 __pointer2: .ds 2
 __pointer3: .ds 2
 ;etc
 #endasm
 
 then you have to prep the pointer because it's null.
 
  lda low(_datas_pointer)
 sta <__pointer1
 lda high(_datas_pointer)
 sta <__pointer1+1
 
 now you can do
  lda [__pointer1]
 
 __pointer1 will always keep its value. The compiler won't erase it or use it. It has its ZP vectors reserved for internal operations.
 
 
 
 When/if you get more comfortable with HuC and ASM, I can show you how to have complete control over the C function style of calling and passing parameters. It's directly compatible with the assembly side. You can use registers, ZP fast ram, pointers, etc. to pass arguments directly to an ASM function, but the compiler will treat the function as if it's internal. You can write whole functions in pure asm (and without using #asm/#endasm) and not have to worry about a lot of compiler overhead/memory layout issues. But to be honest, if you get that comfortable/familiar with the system, there isn't really a need for HuC at that point at all. You'll want to reclaim all that wasted lib space HuC and Mkit eat up.
 
 
 
- 
				Thanks man, your explanations are clear ..
 
 For the moment I'm not familiar enough with the PC Engine hardware to migrate to all-asm or advanced asm functions, but I'll keep your offer in mind  :wink:  ..
 The cool stuff in asm is not easy to mix with HuC for a noob ..
 
 I'll try your solution for the pointer ..
- 
				hi i have included
 #asm
 .zp
 __pointer1: .ds 2
 #endasm
 
 before any huc code.
 
 Now <__pointer1 or [__pointer1] is working, but this pointer is not available directly in HuC  :cry:
 I must pass it through an intermediate global variable declared in HuC.
 
 I have tried this:
 #asm     
 ldx #$0
 lda logo_pic_pal,x
 sta #low(__pointr)
 inx
 lda logo_pic_pal,x
 sta #high(__pointr)
 #endasm
 
 And "incorrect addressing mode" again  :cry:
 
 I think that I must use only < and [] for ZP pointers.
 
 I have tried this:
 #asm          
 lda #low(_logo_pic_pal)
 sta <__pointr
 lda #high(_logo_pic_pal)
 sta <__pointr+1
 #endasm
 
 Is this equivalent to:
 int logo_pic_pal[16];
 int *pointr;
 
 pointr = logo_pic_pal;
 
 ???
- 
				YEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAHHHHH   :dance:
 
  #asm          
 lda #low(_logo_pic_pal)
 sta <__pointr
 lda #high(_logo_pic_pal)
 sta <__pointr+1
 lda [__pointr]
 sta _val
 #endasm
 
 This works fine and is equivalent to ..
 
 int logo_pic_pal[16];
 int *pointr;
 int val;
 
 pointr = logo_pic_pal; /* pointr takes the address of logo_pic_pal */
 val = *pointr;  /* read the content at the address held in pointr */
 
 Arf f*ck, it only works for a byte  :cry:
 
 
 I have found my problem  :dance:
 I was taking only the low byte of the variable. [-(
 
 YEAAAAAAAAAAAAAAAAAAAAAH touko 2 asm 0  :clap:
 #asm       
 lda #low(_logo_pic_pal)
 sta <__pointr
 lda #high(_logo_pic_pal)
 sta <__pointr+1
 lda [__pointr]
 sta _val
 inc <__pointr
 lda [__pointr]
 sta _val+1
 #endasm
 
 Thanks tom for your patience and examples, I have a good start for writing inline ASM now  :dance:
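 In C terms, the two-step read above amounts to the sketch below (read_word_le is a hypothetical name): fetch the low byte, fetch the byte after it, and combine them little-endian. The C version indexes from the base address instead of incrementing it, which is the same idea as using "ldy #$01" with "[pointer],y" in the earlier examples and avoids relying on the pointer's low byte not wrapping past $ff.

```c
#include <stdint.h>

/* Reads a 16-bit little-endian value through a byte pointer:
   p[0] is the LSB (lda [ptr] / sta _val),
   p[1] is the MSB (next byte / sta _val+1). */
uint16_t read_word_le(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}
```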
- 
				hi i have included
 #asm
 .zp
 __pointer1: .ds 2
 #endasm
 
 before any huc code.
 
 now <__pointer1 or [__pointer1] is working, but this pointer is not available directly in HUC  :cry:
 
 
 Yes, but... the whole point of using asm for pointer handling is that HuC's pointer handling is extremely slow. Remember, you can declare a lot of pointers in ZP.
 
 So this:
 #incpal(logo_pic_pal,"pcx/tg16.pcx") (array containing all the bcgnd colors)
 
 int *datas_pointer;
 int var;
 .
 .
 .
 .
 datas_pointer = logo_pic_pal;
 becomes this:
 
 #incpal(logo_pic_pal,"pcx/tg16.pcx") (array containing all the bcgnd colors)
 #asm
 .zp
 _datas_pointer: .ds 2
 #endasm
 int var;
 .
 .
 .
 .
 #asm
 lda #low(_logo_pic_pal)
 sta <_datas_pointer
 lda #high(_logo_pic_pal)
 sta <_datas_pointer+1
 #endasm
 
 You can skip the intermediate step and directly name the pointer in ASM. The only downside to this is, it looks kinda ugly. HuC doesn't support inline macros. It would be nice because you could have a C inline macro be an ASM chunk of code, and have it function more C like. Another poor thing with HuC is that you can't have #asm and #endasm on the same line, otherwise you could use the define directive in conjunction with ASM macros.
 
 
 i have tried this 
 #asm     
 ldx #$0
 lda logo_pic_pal,x
 sta #low(__pointr)
 inx
 lda logo_pic_pal,x
 sta #high(__pointr)
 #endasm
 
 And " incorrect addressing mode"   re  :cry:
 
 
 You can't write to a #. "#" means immediate, not address. Also, logo_pic_pal needs a "_" in front of it. Any global label, variable, or array defined in HuC gets an underscore in front of its name. This is so HuC declarations don't conflict with the ASM side. Remember, HuC builds out a complete ASM file, not a binary. HuC then calls the assembler to build the binary from the ASM file that HuC generated. And lastly, you don't want to use "x" indexing for logo_pic_pal. It's not an array of pointers. It's a single address. Think of "#" in front of the label as "&" in C.
 
 
 
 I thing that i must use only < and [] for ZP pointers .
 
 I have tried this
 #asm          
 lda #low(_logo_pic_pal)
 sta <__pointr
 lda #high(_logo_pic_pal)
 sta <__pointr+1
 #endasm
 
 Equate to :
 int logo_pic_pal[16];
 int *pointr;
 
 pointr = logo_pic_pal;
 
 
 Correct. ZP is a special place in ram. To access it as fastram, you need to use the "<" operator. But ZP is also used as address vectors. If you think about it, the 65x processors have no internal 'address' registers. The z80, has up to 3 address registers (IIRC) using pairs of normal registers. It uses half your registers just for 2 address vectors. 68k has 8 address registers, with 1 being reserved for stack. So 7 address vectors. The 65x can have up to 128 address registers, but you rarely need more than say.. 10-20 address vectors or "pointers".
 
 To access ZP ram as address vectors, you need to use the "[]" brackets. You also don't need to use the "<" operator, because only ZP address range can be accessed as vectors.
 
 
 Here's a few examples of EA syntax:
 
  lda #$00   ;<- load an immediate into register Acc. There's also LDX and LDY. LD stands for load and A/X/Y is the register. It's equivalent to move.b #$00,A
 
 lda _logo_pic_pal   ;<- _logo_pic_pal is a label for an address. This is a 16bit address, so it equates to lda $1234 (if $1234 is the address). This loads a byte from memory 
 
  lda #_logo_pic_pal   ;<- # means load an immediate. Like the first example. But _logo_pic_pal is a 16bit address. Since the processor is little endian, the assembler will ignore the top half of the 16bit address number and only load the LSB of that 16bit number.
 
  lda #low(_logo_pic_pal)   ;<- this is the same as the above example, but we use low() instead to make the code clearer: it shows that we are getting the LSB.
 sta somewhere
 lda #high(_logo_pic_pal)  ;<- The same thing as low(), but instead we grab the MSB of a 16bit value.
 sta somewhere+1
 
 Let's say you have a ZP address label as _pointer. Let's say _pointer is ZP address $05. ZP address range is 8bit. So the only addresses you can define are $00 to $ff. So _pointer is ZP address $05. Now, ZP address range IS external. So it has to exist somewhere in the CPU's logical address range. On the HuC6280, this is logical/local address range $2000-$20ff. So _pointer is actually $2005.
 
 sta <_pointer
 sta _pointer
 
 Are both the same address. They translate to:
 sta $05
 sta $2005
 
 As you can see, one of the instructions is using a shorter addressing mode. This means 1 less byte is required to form the address. This translates to a faster addressing mode. Other 65x assemblers, you don't need to use the "<" operator. They see "sta $05" as short addressing. PCEAS (the PCE assembler) doesn't see this. It requires that you use a "<" to tell the assembler to use short addressing mode. If you use "sta $05" directly in PCEAS, it will pad that to "sta $0005". Which is totally incorrect. So it's really important that you use the "<" when you're using ZP as either "fast ram" or address registers/vectors/pointers.
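 The ZP-to-logical-address mapping described above is just a fixed offset. As a small C sketch (ZP_BASE and zp_logical_address are made-up names, and the $2000 base is the one given in this post):

```c
#include <stdint.h>

/* Base of the zero page in the logical address space, as stated in the post. */
#define ZP_BASE 0x2000u

/* Maps an 8-bit ZP address ($00-$ff) to its full 16-bit logical address. */
uint16_t zp_logical_address(uint8_t zp)
{
    return (uint16_t)(ZP_BASE + zp);
}
```

 So "sta <_pointer" with _pointer at ZP $05 and "sta $2005" hit the same byte; the first form is simply one byte shorter and faster.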
 
 And like I said, if you're using any ZP address range for vectors/pointers - you need to use the "[]" brackets. And without the "<" operator. That trips up some people.
 
 And one more thing to remember; you don't always need to use pointers to access arrays that are in ram. You can use direct addressing with indexing. It's faster than using a pointer.
 
 Question: Do you know 68000 assembly?
 
 
 
- 
				
 YEAAAAAAAAAAAAAAAAAAAAAH touko 2 asm 0  :clap:
 #asm       
 lda #low(_logo_pic_pal)
 sta <__pointr
 lda #high(_logo_pic_pal)
 sta <__pointr+1
 lda [__pointr]
 sta _val
 inc <__pointr
 lda [__pointr]
 sta _val+1
 #endasm
 
 
 
 Yup, you got it! :D
- 
				Thanks for the explanations tom, together with your examples they are a gold mine  8) ..
 
 In my student days I learned a little bit of 6809 and 68K assembly, but it's faaaaaar back in my mind ..
 
 It's great fun to use inline assembly in a HuC project, but not for an entire game  :mrgreen:..
 With your first examples, I have noticed the difference between
 
 - #low/#high and low/high
 - </[]
 
 I have noticed something strange in HuC ..
 
 If I do
 #incpal(logo_pic_pal,"pcx/tg16.pcx")
 
 the data in logo_pic_pal is not accessible (in C or in ASM) until I do
 logo_pic_pal[0];
 
 Do you know why???