Author Topic: Graphic, Sound, & Coding Tips / Tricks / Effects / Etc. Tools for development (Read 22566 times)

touko · « **Reply #120 on:** August 10, 2016, 04:01:48 AM »

I have an idea for map's collisions detection.
For now I'm using detection directly in VRAM ( very convenient for shooters ), but how to take in account many possibilities as dectructible or not, instant death , walkable or not ?.
If you do not have too many possibilities, the deal is to use tiles's pallettes, you can have up to 16 possibilities.
But you have to deal also with the wrap around scrolling (if your game scroll of course).

elmer · « **Reply #121 on:** August 10, 2016, 06:22:26 AM »

Quote from: touko on August 10, 2016, 04:01:48 AM

If you do not have too many possibilities, the deal is to use tiles's pallettes, you can have up to 16 possibilities.

Ouch, that sounds like a terrible waste of palettes!

Traditionally you'd use a separate collision map in main RAM/ROM with all the info that you need.

You main map would either be tile (8x8) or block (16x16) based.

Then you can either have a full collision map of the same size, or just index the collision/properties based upon the tile or block number.

You can either do the properties as 1 byte per tile/block, or if you just need a number of 1 bit flags, it's easy to access up to 2048 different tile/block attributes with a "bit attribute_table,x" instruction.

This sort of thing does (of course) need some tool-support to be usable in practice.

I believe that Mappy and ProMotion, for one example, both support a separate collision layer.

Now that ProMotion has dropped the price on the full product, and finally has an older version for free, I can't see much reason for folks to avoid using it.

touko · « **Reply #122 on:** August 10, 2016, 09:29:19 PM »

Quote

Traditionally you'd use a separate collision map in main RAM/ROM with all the info that you need.

Of course, but the idea is to avoid that file.

Gredler · « **Reply #123 on:** September 01, 2016, 06:06:11 AM »

This script for Photoshop popped up on a vfx artist group I am in, I am going to give it a shot for some vfx for a side project I am working on, but thought this was something in someone here might find useful.

The script creates animation sheets from a layered file of frames of animation,

https://github.com/bogdanrybak/spritesheet-generator

Bonknuts · « **Reply #124 on:** November 11, 2016, 01:14:11 PM »

Extended dynamic tiles: Take the idea of dynamic tiles to a more advance approach.

Take this current image here:

The dynamic tiles for this set would be something like this:

The red arrows are to indicate that the image shifts 8 pixels to its destination form. Typical for dynamic tiles, but here's the catch - they don't wrap. Instead, when the transition is the to last frame, or to the first frame, you update the tilemap itself by shifting these over entry in the map (left or right).

In this demo, I setup a block of 256 tiles to be the dynamic tileset. Of course, only 121 tiles are used in this example.

Why do this? Because you can make a whole second BG layer, as a real map layer, that can have these above "objects" anywhere in the map. The reason why they can be placed next to each other, is that there's also one common dynamic tile on the right or left sides to ALL the objects. The skulls and the lava actually can't be placed on the same line (map line) going across the screen for obvious reasons, but the window frames can be placed next to each other, at different vertical positions - etc. The "window" object represents the dynamic objects that I'm trying to show as example here. The skulls and lava are special case.

In this example, the second layer scrolls independently left or right, but not up and down - that's fixed. Only because the solution is a little more complicated, not that it can't be done. And in that case, the skulls would have to be sprites, an upper tile lava bubbles would also have to be sprites.

This type of effect isn't limited to dungeons or caves. Instead, imaging an open area where there is sky and clouds. Instead of the brick being the common connecting block, you could have a solid color sky block (say.. blue). You could even have different horizontal strips of clouds moving at different speeds (parallax), and the foreground would scroll as it's own layer (parallax clouds on far layer, foreground interactive layer). The cloud "objects" wouldn't have to be a fixed pattern or placement either, as long as they are separated by a single common block (8x8) in their "map" section. Each object can also have its own subpalette associated with it, so you're not limited to 16 colors for the whole fake BG layer.

Now for the resource part: This has to be all done in ASM. The tiles need to be embedded opcodes for speed. 256 tiles @ 4bit color depth takes 41k cpu cycles to update in a single screen, or 34% cpu resource. Of course, this being the far background layer - it typically scrolls slower than the foreground layer for most games (not all, though). In that case, assuming the far BG layer scrolls a half speed or multiples of half speed - you can divide that 34% requirement over two frames. And use the VDC VDMA transfer (set the res to high res mode before doing this) to move the final buffer of dynamic tiles over the map section, of if you like - keep two copies of the BAT active area in vram, updating it as needed for both, then switch to the alt one once the dynamic tile sequence is finished.

It's also not as easy as that. The BAT has to be updated. This is a read, test, conditional modify, write method. The fast way I could find to go about it, cost 30% cpu resource (updates a whole screen), but a more flexible method I did that read and wrote vram was 32% cpu resource. Again, depending on how you're using this advance dynamic tile setup - that could be divide over three frames; ~10.5% per frame. I won't go into details about that, because it would be easier to show what I mean in some demo code. But the above game (PC dos game) would benefit from that.

And honestly, I'd probably mix a little sprites as BG objects (edges) like Ys III does just to break up the hard edges.

tl;dr
You can do a whole screen with specials objects inside a dynamic tile system without having to write to a full screen buffer. And of course, doing more advanced than the simple 16x16 block pattern of typical dynamic tiles done in PCE games.

Punch · « **Reply #125 on:** November 11, 2016, 03:01:48 PM »

bonknuts, the second image's not showing

Bonknuts · « **Reply #126 on:** November 11, 2016, 04:13:59 PM »

Ok. Fixed. How about now?

Bonknuts · « **Reply #127 on:** November 11, 2016, 05:00:47 PM »

Here's an example with the cloud objects..

This has seven dynamic objects; 212 total tiles. As you can see, the common tile for all of them is the leading blue tile (column). The dynamic objects can be placed anywhere in 8x8 grid location, as long as the one column of common tile separates them.. though they could be separated by more for whatever reason; that would be placement in a tilemap strip.

This isn't the best example, but it should show something other than classic brick style that's so common on the PCE.

Also note, but not pictured above, object can be joined by a common column of tiles - think about how mountain ranges are connected, or clouds are normally joined in tilemap setups. So it's possible to have those types of connections too. Or actually mix it up; certain objects belong to certain common column sharing (tiles).

Lastly, the red "1" and "2" would represent two different tilemap strips at different speeds. But the draw back to this, is the objects in the second column would need their own distinct dynamic set. They can't share objects of "different speeds" for obvious reasons (the updates to the animation isn't the same between the two map strips). But it allow parallax on the pseudo layer.

touko · « **Reply #128 on:** November 11, 2016, 09:36:03 PM »

If you have less than 192 tiles (in H32) and enough vram, i think it's better to use DMA for dynamic tiles .
In your first exemple 121 tiles is 31ko of vram, easily doable(obviously more for SGX) .
You can also mix the two techniques .

You can maybe use a 1 or2bp for 2nd layer's tiles,this technique was very used on C64 games :
https://youtu.be/x_1mMhJP6Xo?t=4m46s

https://youtu.be/x_1mMhJP6Xo?t=14m39s

Note tiles was also used for player shoots .

Of course is like you call "classic brik style" but i think done cleverly,it can do a very good parallax/2nd layer effect .

Bonknuts · « **Reply #129 on:** November 12, 2016, 07:03:04 AM »

Quote from: touko on November 11, 2016, 09:36:03 PM

If you have less than 192 tiles (in H32) and enough vram, i think it's better to use DMA for dynamic tiles .
In your first exemple 121 tiles is 31ko of vram, easily doable(obviously more for SGX) .

Well, if it's all sitting in vram with all 8 frames, then there's no need to even do a vDMA since you have direct access to any from *and* you are modifying the map each frame.

And that's an option. And would make parallax parts of the map (pseudo layer) of any object pretty easy to do (since you have access to all objects, scroll speed of the object is directly related to which sets of frames you point).

But generally, I don't like wasting vram like that unless there's a really big benefit for doing it.

Quote

You can maybe use a 1 or2bp for 2nd layer's tiles,this technique was very used on C64 games :
https://youtu.be/x_1mMhJP6Xo?t=4m46s

https://youtu.be/x_1mMhJP6Xo?t=14m39s

Note tiles was also used for player shoots .

Of course is like you call "classic brik style" but i think done cleverly,it can do a very good parallax/2nd layer effect .

Yeah, but those are different because it's one single repeating pattern (brink) - different subject and different approach. The object method uses a map that allows any configuration and placement of those animated objects.

touko · « **Reply #130 on:** November 12, 2016, 07:19:43 AM »

Quote

Well, if it's all sitting in vram with all 8 frames, then there's no need to even do a vDMA since you have direct access to any from *and* you are modifying the map each frame.

It's more difficult and is consuming more CPU to modify each tiles entry than swapping tiles datas IMO,if you have enoug VRAM to spare, VDMA is almost free .

Quote

you are modifying the map each frame.

What do you mean

, by hand with CPu ??

Quote

But generally, I don't like wasting vram like that unless there's a really big benefit for doing it.

Of course you're right, but if you have VRAM to spare why not ??
You can freeing CPU for others purpose ;-) .

Bonknuts · « **Reply #131 on:** November 12, 2016, 08:10:56 AM »

Quote from: touko on November 12, 2016, 07:19:43 AM

Quote
Well, if it's all sitting in vram with all 8 frames, then there's no need to even do a vDMA since you have direct access to any from *and* you are modifying the map each frame.
It's more difficult and is consuming more CPU to modify each tiles entry than swapping tiles datas IMO,if you have enoug VRAM to spare, VDMA is almost free .

More advance effects need more difficult setups and more cpu resource.

Quote

Quote
you are modifying the map each frame.
What do you mean , by hand with CPu ??

Just as I said earlier, the pseudo BG layer is made up of dynamic tile objects - no simple brick style repeating pattern. Those objects are attached to a separate map layer that gets composited into the regular BAT layer. Once each object completes its frame rotation, it gets set back to #0 and the tilemap is updated with the new position (advance the pseudo tilemap layer to the next 8x8 entry and do a new composite into the regular map/bat).

Quote

Of course you're right, but if you have VRAM to spare why not ??
You can freeing CPU for others purpose ;-) .

*If* you have it to spare, sure. But it highly depends on the setup (how many tiles you want to use or how many sprite frames you want to have in memory to keep updating bandwidth down to a minimum).

touko · « **Reply #132 on:** November 13, 2016, 01:00:27 AM »

Quote

Just as I said earlier, the pseudo BG layer is made up of dynamic tile objects - no simple brick style repeating pattern. Those objects are attached to a separate map layer that gets composited into the regular BAT layer. Once each object completes its frame rotation, it gets set back to #0 and the tilemap is updated with the new position (advance the pseudo tilemap layer to the next 8x8 entry and do a new composite into the regular map/bat).

Ah ok, i see now ..

Quote

*If* you have it to spare, sure. But it highly depends on the setup (how many tiles you want to use or how many sprite frames you want to have in memory to keep updating bandwidth down to a minimum).

i agree,and i think it's more suited for SGX than PCE.

Bonknuts · « **Reply #133 on:** November 15, 2016, 06:22:55 AM »

With context this might seem a little confusing, but..

Code: [Select]

	st2 #$xx
	st2 #$xx
	st2 #$xx
	st2 #$xx
	st2 #$xx
	st2 #$xx
	st2 #$xx
	st2 #$xx
	bbr0 zp0,.skip0
	rts
.skip0
	st2 #$xx
	st2 #$xx
	st2 #$xx
	st2 #$xx
	st2 #$xx
	st2 #$xx
	st2 #$xx
	st2 #$xx
	bbr1 zp0,.skip1
	rts
.skip1

(Doesn't have to be all ST2 opcodes; can be st1/st2 as well)

I.e. you can break up long runs of pixel writes as short blocks, and control the length (in a course amount) with a bitmask in a series of ZP variables. In this example, I'm writing lines instead of columns because I have the VDC write incrementor set just right.

My transparency demo (that uses TF4 BG) could really benefit from this. You can do dynamic tiles are columns, or as single bitmap lines (the VDC allows either write method). Each have their advantages and disadvantages. Column writing allows easy re-positioning to make a large area scroll horizontally with only have frames - but it's more complicated if you do vertical scrolling (shifting). Line mode allows vertical scrolling, as well as doing hsync sine wave effects and vertical scaling effects, as well as easy vertical mirroring - but is more difficult to scroll horizontally.

All this is in relation to really large "brick style" dynamic blocks. Stuff half the size of the screen, or possibly larger than the screen itself (I have such a demo effect that uses this, it just needs a real demo to be part of).

The TF4 transparency demo for PCE, if anyone has seen it, basically leaves the first two planes of a PCE tile (p0,p1) for tile data. That's 4 colors, but more if you use subpalettes (3*16 + 1= 49 colors to be exact). The second composite tile of the 4bit tile is plane 2/3, which the cpu writes a large dynamic tileset data to. The cloud layer is made up of three colors, and the tiles are 4 colors. Each color of the could layer corresponds to a set of 4 hue tinted colors in the current subpalette. With color #0 on the cloud layer showing normal colors of the tile underneath it. Like I said, you can use different subpalettes for any of the tiles, as well as they all have a cloud hue tinted set in them (can be whatever and different from tile to tile).

Two issues with this approach for the demo: the background "area" that's affected by the transparency part needs to be actual bitmap buffer. This is easily done with tiles; you just stream the right edge of the screen (off screen) with a single column of tiles when needed. Not a big deal and barely any cpu resource to do this (the nice thing is you can do easily tile flipping support this way that the PCE normally doesn't support). In the TF4 pce TP demo, only the area where the cloud layer is, needs to be this bitmap thingy. The rest of the tilemap can be regular tiles, meaning the buffer doesn't need to be that large if you don't need it to be.

The second issues; the most efficient way to write the cloud layer. If you've seen the demo, you'll notice at some point that when the map keeps scrolling, transparency overlay gets stuck. That's because the demo was never finished. But it's also because the demo doesn't handle "wrap around". So what you're seeing is a linear stretch, and then something it can't handle (wrap around). To handle wrap around, you need to be able to write with specific start and stop points. In the TF4 demo, it does "line" mode. This allows it to write 1/8 of the whole image in one long st1/st2 opcode output to the VDC. To put that into prospective: 256x112 (I think that's the height of the cloud layer) would take 256x112 @ 2bit = 7,168 bytes to write to vram. At 5 cycles a byte, that's 35,840 cpu cycles or 30% cpu resource. In actuality though, since there are gaps in the cloud layer, those can be stores as ST2 opcodes - saving one write per 8x1 blank area. For the sake of example, let's say that is 15% blank space. That brings down the cpu write sequence to ~25% cpu resource. Now notice that the cloud layer is at the bottom of the screen? 7,168 * 5 cycles = 35,840 cycles / 455 cycles (one scanline) = 79 scanlines. This means I can actually do this during the top of active display; I have enough time to race the display, leaving vblank free and leaving the rest of active display free (I'll just assume active display is 224 scanlines tall). Another method, if the cloud layer was at the top of the screen instead of the bottom, would be the "trail" the display so the changes being made on that frame don't show as you're writing - blah blah blah.

Here's the video.. (touko uploaded it)

Draft: I have a little more to write, so I'll either update this post or just post some more..

Ok, so the second issue isn't cpu resource (at least not yet) but getting the dynamic offset image to the screen image buffer, and have it wrap around. One easy to do to his is store the image as column data, and after you cycle through all 8 frames of the shifted image, offset the column to + 1. Of course, mod (%) by the length for wrap around. The concept is simple. But here in lines the problem; the composite tile format. While this helps us in letting the VDC do the transparency work for us (this is how plane format facilitates transparency effects - a crude way), the composite format is now a hindrance. For column writes, can only write to one set of plane pairs, this means only 16 bytes can be written before you have to increment the vram pointer. This is going to take somewhere between 28-34 cycles *if* you embedded that into the graphic data itself (using A:X to hold vram offset, or X as an index to a table). That adds another 15k cycles on top of the 35k (unoptimized version; no gap optimiziation). Another approach is to write only one line of pixels per tile; write 1/8 of each composite tile. If the height of the image block to write to the screen is 112 pixels, that's 14 lines at 2 bytes each.. so 28 bytes written before a vram pointer re-position is needed. 7,168 / 28 * 34cycles = 8,704 cycles. Not bad. Almost cut the overhead by half.

If we know the block of data is wider than it is tall, we could optimize for that horizontal writes - but this introduces other problems. With column writing, it's easy to offset every 8 pixels when needed. Line writing doesn't allow that. If you break line writing up to smaller segments, like the original code show above, you can have a string of data at a smaller course length to the buffer. You can even jump into the middle of a string of data (opcodes), or anywhere from start to finish. If you think about this from the left side vs right side problem, dealing with wrap around when there isn't alignment, the right side is going to be the problem. The left side can be dealt with by jumping into the middle or whatever offset of the stream length (above is 8 pixel writes, to segments of 8 pixels - and the shift frame takes care of the intra pixel offset inside that 8pixel segment).

But how to deal with the right side. One way is to handle an over spill area. This is an offscreen area allows the extra data to be ineffective to the display. The downside is now the buffer is a little bit wider. Not doing an over spill area means having to write out manually the remaining bytes (pick your poison). Both methods are more complex than the column mode, and both methods require a good size jmp/jsr table for offsets. They also require data of "sets" (shift sets) to be banked align so that one routine works for all data shift sets. And to top it off, you still have to reposition the vram pointer once per "line" write (if you start from the left first, this is only once per line even on a wrap around point). What's the advantage in cycles? Well, the current cloud layer is something like 256px wide. So that's 32 paired writes (32 8x1 @2bit line segments) for 64 bytes. The code for 8x1 cells has an overhead of 8cycles (BBR instruction), so that averages out to be 5.5 cycles a byte written in a block of 8 segments (16 bytes). And only one 26-34cycle over head for vram reposition. But you'll have overhead from the spill area write, each line, to include into that account.

In the end, the line method will be super convoluted and might only be slightly faster than the column mode version, and generating those tables for the offsets is going to be a huge pain in the ass - but all said and done, the line method would allow you to sine wave effects both horizontally and vertically like with a normal PCE map/bg layer, on top of having vertical scrolling ability too (animate the layer scrolling up from the bottom). The column method is easier, but can't do anything like the line method can do. So like I said, super convoluted but it's also a one and done type of deal. Once working, it'll be a really powerful effect for the PCE.

As far as those jump tables are concerned, I'd most surely write a PC app to generate that code. No amount of macros in PCEAS is going to make that an easy job.

Is this extreme? You bet. But is this doable? Completely. And from cpu resource perspective, incredibly doable. It might not be representative of what any dev would do back in the day, but this isn't what that's about. This is about pushing the system to its limits - to see what it can do.

Just to note: the cloud layer does not have to remain static. It can scroll at its own speed in either direction (right or left). Both methods work, and both methods allow the cloud layer to scroll left or right, but line method allows for additional effects to be applied to that layer.

Also note, the transition line.. right above the cloud layer - those are no longer 2bit tiles. They're 4bit tiles, a allowing the mountain range to use 4 colors total, and still have a 5th one as well as any static cloud pixel data (more colors). So no, the whole screen doesn't need to be made up for 2bit colored tiles. But even for the areas that are, you still have subpalettes to break up the color usage, and that the transparecy layer will still apply to those subpalettes.

touko · « **Reply #134 on:** November 15, 2016, 06:52:43 AM »

I love this demo ;-)

Author Topic: Graphic, Sound, & Coding Tips / Tricks / Effects / Etc. Tools for development (Read 22566 times)

touko

Re: Graphic, Sound, & Coding Tips / Tricks / Effects / Etc.

elmer

Re: Graphic, Sound, & Coding Tips / Tricks / Effects / Etc.

touko

Re: Graphic, Sound, & Coding Tips / Tricks / Effects / Etc.

Gredler

Re: Graphic, Sound, & Coding Tips / Tricks / Effects / Etc.

Bonknuts

Re: Graphic, Sound, & Coding Tips / Tricks / Effects / Etc.

Punch

Re: Graphic, Sound, & Coding Tips / Tricks / Effects / Etc.

Bonknuts

Re: Graphic, Sound, & Coding Tips / Tricks / Effects / Etc.

Bonknuts

Re: Graphic, Sound, & Coding Tips / Tricks / Effects / Etc.

touko

Re: Graphic, Sound, & Coding Tips / Tricks / Effects / Etc.

Bonknuts

Re: Graphic, Sound, & Coding Tips / Tricks / Effects / Etc.

touko

Re: Graphic, Sound, & Coding Tips / Tricks / Effects / Etc.

Bonknuts

Re: Graphic, Sound, & Coding Tips / Tricks / Effects / Etc.

touko

Re: Graphic, Sound, & Coding Tips / Tricks / Effects / Etc.

Bonknuts

Re: Graphic, Sound, & Coding Tips / Tricks / Effects / Etc.

touko

Re: Graphic, Sound, & Coding Tips / Tricks / Effects / Etc.