overtune2005
Let's hope to see it implemented in some game
That's generally what I always aim for, but for now I'm just experimenting. Obviously, I'd like that too (even though I'd be a bit sorry to make something that doesn't run on an unexpanded Amiga). Who knows...
In the meantime, reports from other users have revealed that PVE runs faster on a 68030 than on a 68060. I'm trying to figure out where the catch is but, unfortunately, I only have a 68030.
If anyone has an Amiga with a 68040 or 68060, could you do me the favor of trying the experimental version downloadable from here and sending me the results printed on exit? Reports for the 68020 are welcome too!
To obtain comparable results, the test must be run as follows:
1. launch the program;
2. click the left mouse button on the splash screen;
3. wait 5 seconds (without touching the joystick in the meantime);
4. click the left mouse button.
Thanks in advance!
RETREAM - retro dreams for Amiga, Commodore 64 and PC
amiwell
Posts: 12885
Comment 22
amiwell79
2 December 2023 18:53:37
Great work, saimo. I watched the video on YouTube, all very interesting. Keep it up!
v1.1 (22.12.2023)
* Reworked screen buffering, so that the raster data is more efficiently written to CHIP RAM when bitplanes DMA is inactive.
* Improved 68030 caches handling.
* Added 68040 and 68060 caches handling.
* Added MMU handling to prevent the MMU from affecting the speed negatively.
* Optimized rendering core by making it write the dots sequentially.
* Made a little 68060-specific code optimization.
* Ensured 68060 superscalar dispatch is enabled.
* Added live-togglable staggered lines video filter, which helps show better colors on devices that do not support SHRES and reduces the jailbars effect on devices that support SHRES (to enable/disable: [F1]).
* Made fps indicator live-togglable (to enable/disable: [F2] ).
* Made quitting from the voxel screen return to the splash screen.
* Replaced mouse controls with keyboard controls.
* Added benchmark function.
* Added command line switches to control the CPU caches.
* Fixed bug that caused a longword to be written to a random location when the fps indicator was on.
* Fixed an innocuous initialization bug.
* Made cleanup code more robust.
* Updated, extended and fixed documentation.
Posts: 703
Comment 24
saimo
27 March 2024 23:47:15
For a long time I had wanted to dig up some code that is 20 or more years old and have some fun with it on PED81C. I finally made up my mind and turned it into a test program called Zoomaniac.
The details are in the video and in the manual excerpt below. The download is available here.
Zoomaniac has been written to evaluate the performance on a stock Amiga 1200 of
a general-purpose texture scaling routine that writes directly to a PED81C
raster.
The following results are relative to the full screen effect that zooms the
cosmonaut in and out.
On a stock Amiga 1200, the execution speed is between 25 and 26 fps. If the
staggered lines are turned on, the performance drops by about 1 fps (which was
unexpected, since all that such option adds is a Copper WAIT and a Copper MOVE
for each rasterline).
Given that the DMA load caused by PED81C is "double" (see its documentation for
the details), a version that uses only half the number (2) of bitplanes has been
made to check the performance as if the Amiga had a native chunky video mode.
Surprisingly, the performance did not improve at all: with respect to CHIP bus
access, the scaling code must interleave so nicely with the bitplane data
fetches that having more bus cycles available makes little or no difference.
An Amiga 1200 equipped with a 68030 clocked at 50 MHz and 60 ns FAST RAM easily
performs at a steady 50 fps. To find out the maximum performance, new tests were
made with special versions of the program that had the video synchronization
code disabled.
The speed when running the program normally was between 77 and 78 fps. The
staggered lines option lowered the fps by about 2. The 2 bitplanes versions
performed better, reaching 80-81 fps or, with the staggered lines on, 79-80 fps.
Like on the stock Amiga 1200, the extended Copperlist that implements the
staggered lines causes a small and similar performance drop. Instead, the
halving of the bitplanes DMA load did produce a speed increase.
The following table sums up the results.
S = stock Amiga 1200
E = Amiga 1200 68030 @50 MHz / 60 ns FAST RAM (Blizzard 1230 IV)
2 = 2 bitplanes on
4 = 4 bitplanes on
L = staggered lines on
Notes:
* when FAST RAM is detected, an alternative and more suitable scaling routine
is used (although writes still happen to CHIP RAM);
* on (some?) machines equipped with FAST RAM, an even faster strategy would be
to render to FAST RAM and then simply copy the rendered frame to the CHIP RAM
raster at the maximum speed.
* The scaling routine fits any rectangle from a texture into a rectangle of any
size and ratio of another texture with nearest-neighbor matching.
* Logic and rendering are totally asynchronous: the logic runs always at 50 Hz
and the rendering never stops (unless it reaches the limit of 50 fps, imposed
by the display refresh rate), thus exploiting the machine's full potential.
* The screen buffering employs three buffers in CHIP RAM.
* The screen resolution is 1020x256 SHRES pixels, which correspond to 255x256
LORES-sized physical dots and to 128x256 logical dots.
* The code is 100% assembly.
* The program takes over the system entirely and returns to AmigaOS cleanly.
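As an illustration of the nearest-neighbor matching mentioned in the notes above, here is a minimal Python sketch. It is not the author's routine (which is written in 68k assembly and writes to a PED81C raster): the function name, the `(x, y, width, height)` rectangle layout and the list-of-rows texture representation are all assumptions made for the example.

```python
def scale_nearest(src, src_rect, dst, dst_rect):
    """Fit src_rect of src into dst_rect of dst with nearest-neighbor matching.

    src and dst are lists of rows, one integer per dot (hypothetical layout).
    Rectangles are (x, y, width, height) tuples (hypothetical layout).
    """
    sx, sy, sw, sh = src_rect
    dx, dy, dw, dh = dst_rect
    for row in range(dh):
        # Pick the source row that corresponds to this destination row.
        src_row = src[sy + row * sh // dh]
        dst_row = dst[dy + row]
        for col in range(dw):
            # Pick the source dot that corresponds to this destination column.
            dst_row[dx + col] = src_row[sx + col * sw // dw]
    return dst


# Usage: stretch a 2x2 texture to 4x4.
texture = [[1, 2],
           [3, 4]]
screen = [[0] * 4 for _ in range(4)]
scale_nearest(texture, (0, 0, 2, 2), screen, (0, 0, 4, 4))
```

The integer divisions stand in for the fixed-point stepping that a real routine would more likely use; the sketch favors readability over speed.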
CHANGELOG
March 27, 2024
* Added the Zoomaniac demo.
* [PED81C Voxel Engine] Made a couple of minor changes.
* [PED81C Voxel Engine] Updated documentation.
January 1, 2024
* Rebuilt demos against latest custom framework.
* [PED81C Voxel Engine] Slightly optimized background rendering.
* [PED81C Voxel Engine] Corrected benchmark fps calculation (312 rasterlines were considered instead of 313).
* [PED81C Voxel Engine] Built against latest custom framework.
* [PED81C Voxel Engine] Updated, extended and fixed documentation.
Comment edited on 27/03/2024 at 23:48:13
Posts: 703
Comment 25
saimo
29 March 2024 14:56:02
Following the feedback I received, I have released a new version of Zoomaniac that allows enabling/disabling the fps limit by means of [F3].
Quote
* The number shown in the top-left corner of the effects screen is the fps
indicator, which reports the number of frames rendered in the last second.
It is limited to 999.
* When the fps limit is on, the maximum number of frames rendered per second
is 50 also on the most powerful machines, as the display refresh rate is 50
Hz. When the fps limit is off, frames are rendered without pausing even when the
previously rendered frame/frames has/have not been (completely) displayed yet. On
machines which cannot run the program at 50 fps or more, turning off the
limit has no effect whatsoever; on the other machines, the only visible effect
is that the fps indicator goes beyond 50, thus giving a measure of the maximum
speed that the machine can reach.
In addition, this new version runs 1-2 fps faster on the 68030 thanks to the data cache burst:
Quote
* on 68030 tests proved that: it is advantageous to turn the data cache burst
on when scaling a 128 dots wide rectangle to a rectangle wider than 8 dots
(i.e. with an X scaling factor greater than 1/16); with a scaling factor of
1/16 or less the difference proved to be minimal when both the source and
destination rectangles were 256 dots tall; considering that turning the data
cache burst off would therefore be advantageous only with very narrow and
tall rectangles (which are uncommon and intrinsically rather inexpensive),
it is not worth it to implement a data cache burst management inside the
scaling routine;
CHANGELOG
v1.1 (28.3.2024)
* Turned the 68030 data cache burst on for slightly faster performance.
* Made a couple of minor optimizations.
* Added frames rendering limit toggle ( [F3] ).
* Worked on fps indicator: added hundreds digit; made digits smaller; made digits auto-clearing, so that they read correctly also when they are not cleared before drawing.
* Made staggered lines toggle as soon as [F1] is pressed (instead of when it is released).
* Updated splash screen.
* Redesigned the 'M' in the logo.
* Updated and extended manual.
Very nice. I'm looking forward to some practical implementation, a game.
But can it be used as an optimized video driver for emulators such as the Mac emulator ShapeShifter? Thanks.
Posts: 703
Comment 28
saimo
29 March 2024 22:55:29
@amiwell79
Thanks!
@overtune2005
overtune2005
Very nice. I'm looking forward to some practical implementation, a game.
Me too!
Quote
But can it be used as an optimized video driver for emulators such as the Mac emulator ShapeShifter? Thanks.
In theory it can be used for anything, but in practice the graphical limitations must be taken into account. For the case you mention it would not be suitable, as the horizontal resolution is halved (although this is partially masked by the automatic interpolation between columns).
Posts: 703
Comment 29
saimo
2 April 2024 22:22:19
To have a complete set of scaling routines (which I will hopefully use for something, one day), I have added support for color-keying, zero-keying (color-keying with color 0), and horizontal and vertical flipping.
Moreover, since I had initially focused on the stock A1200, the performance on expanded machines was not optimal (because the rendering was done directly in CHIP RAM), so I have also added an alternative buffering method that, when 2 rasters can be allocated in FAST RAM, renders to FAST RAM and then copies the rendered raster to the raster in CHIP RAM as quickly as possible, starting from the bottom of the screen. For the first effect in the test program (the only one whose performance has been measured so far), this yielded a gain of 8-9 fps on my A1200 with 68030.
The updated test program (available at https://retream.itch.io/ped81c ), to demonstrate the new features, stretches and shrinks a texture with color/zero-keying covering almost the whole screen, on top of a full-screen background zoom, with all the possible flipping combinations. All of this is clearly hard work for a stock A1200, whose performance drops to 12-16 fps in the heaviest cases.
(Side note: the video was recorded before I finalized the test program, so it shows an outdated splash screen and some jerks in the background zoom when switching from/to the color/zero-keying effects.)
This excerpt from the updated manual provides further details.
Zoomaniac has been written to evaluate the performance on stock and modestly-
accelerated Amiga 1200s of some general-purpose texture scaling routines in
conjunction with PED81C.
--------------------------------------------------------------------------------
GETTING STARTED
Zoomaniac requires:
* Amiga computer
* AGA chipset
* 170 kB of CHIP RAM
* 1.2 MB of any RAM
* PAL SHRES support
* keyboard
* 1 MB of storage space
To install Zoomaniac, unpack the LhA archive to any directory of your choice.
To start Zoomaniac, open the program directory and double-click the program icon
from Workbench or execute the program from shell.
If your monitor / graphics card / scan doubler do(es) not support SHRES, the
colors will look off or may not show at all. In that case, try the staggered
lines option, which may improve the colors a bit.
* The staggered lines shift the odd lines by 1 SHRES pixel to the right. On
systems which handle SHRES correctly, that will reduce the jailbars effect
(but give the screen a kind of wavy look). On systems which handle SHRES as
HIRES (for example, MNT's VA2000 graphics card and Irix Labs' ScanPlus AGA,
contrary to how it was originally marketed, display only the even or odd
columns of pixels, so only reds and blues or greens and grays show), that
helps improve the colors a bit (giving the screen a kind of scanline
effect). On other systems, the results are unpredictable, but the option is
still worth a try.
* The number shown in the top-left corner of the effects screen is the fps
indicator, which reports the number of frames rendered in the last second.
It is limited to 999.
* When the fps limit is on, the maximum number of frames rendered per second
is 50 also on the most powerful machines, as the display refresh rate is 50
Hz. When the fps limit is off, frames are rendered without pausing even when the
previously rendered frame/frames has/have not been (completely) displayed yet. On
machines which cannot run the program at 50 fps or more, turning off the
limit has no effect whatsoever; on the other machines, the only visible effect
is that the fps indicator goes beyond 50, thus giving a measure of the maximum
speed that the machines can reach.
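The behavior of the fps limit and of the fps indicator described above can be modeled in a few lines of Python. This is only a sketch of the stated rules, not of the program's code: the function names are hypothetical, and `machine_max_fps` stands for the speed a machine would reach with no limit at all.

```python
REFRESH_HZ = 50       # display refresh rate stated in the manual
INDICATOR_MAX = 999   # the fps indicator is limited to 999


def rendered_fps(machine_max_fps, limit_on):
    """Frames rendered per second, with or without the fps limit."""
    if limit_on:
        # With the limit on, rendering never exceeds the refresh rate.
        return min(machine_max_fps, REFRESH_HZ)
    return machine_max_fps


def indicator_value(fps):
    """Value shown by the fps indicator (whole frames, capped)."""
    return min(int(fps), INDICATOR_MAX)


# Usage: a machine slower than 50 fps is unaffected by the toggle,
# while a faster one only shows its real speed with the limit off.
slow_on, slow_off = rendered_fps(25.5, True), rendered_fps(25.5, False)
fast_on, fast_off = rendered_fps(85, True), rendered_fps(85, False)
```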
The following results are relative to the full screen effect that zooms the
cosmonaut in and out without flipping. The source textures are 256x512 dots and
the screen internally consists of 128x256 dots. Since a dot is represented by a
byte, 128x256 = 32768 bytes are fetched and written to render a frame.
On a stock Amiga 1200, the execution speed is between 25 and 26 fps. If the
staggered lines are turned on, the performance drops by about 1 fps (albeit all
that such option adds is a Copper WAIT and a Copper MOVE for each rasterline).
Given that the DMA load caused by PED81C is "double" (see its documentation for
the details), a version that uses only half the number (2) of bitplanes has been
made to check the performance as if the Amiga had a native chunky video mode.
Surprisingly, the performance did not improve at all: with respect to CHIP bus
access, the scaling code must interleave so nicely with the bitplane data
fetches that having more bus cycles available makes little or no difference.
An Amiga 1200 equipped with a 68030 clocked at 50 MHz and 60 ns FAST RAM easily
performs at a steady 50 fps. To find out the maximum performance, tests were made
with the fps limit off.
The speed when running the program normally was between 84 and 86 fps. The
staggered lines option lowered the fps by about 1. The 2 bitplanes versions ran
at the same speed - in this case, that is because most of the CHIP RAM accesses
happen when no bitplanes DMA is going on (see TECHNICAL DETAILS section).
expanded Amiga 1200: Blizzard 1230 IV, 68030 @50 MHz, 60 ns FAST RAM
Notes:
* given that a stock Amiga 1200 reaches about 25.5 fps, it manages to render
128*256*25.5 = 835584 dots per second; considering that the 68020 is clocked
at 14.187580 MHz, rendering 1 dot requires about 14187580/835584 = 17 CPU
cycles;
* on 68030 tests proved that: it is advantageous to turn the data cache burst
on when scaling a 128 dots wide rectangle to a rectangle wider than 8 dots
(i.e. with an X scaling factor greater than 1/16); with a scaling factor of
1/16 or less the difference proved to be minimal when both the source and
destination rectangles were 256 dots tall; considering that turning the data
cache burst off would therefore be advantageous only with very narrow and
tall rectangles (which are uncommon and intrinsically rather inexpensive),
it is not worth it to manage the data cache burst inside the scaling
routines.
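The cycles-per-dot estimate in the first note can be reproduced with a short Python calculation. All the figures come from the note itself; only the variable and function names are made up for the example.

```python
DOTS_PER_FRAME = 128 * 256   # logical dots rendered per frame
STOCK_FPS = 25.5             # about 25.5 fps on a stock Amiga 1200
STOCK_CPU_HZ = 14_187_580    # 68020 clock quoted in the note


def cycles_per_dot(cpu_hz, fps, dots_per_frame=DOTS_PER_FRAME):
    """Average CPU cycles spent per rendered dot."""
    return cpu_hz / (dots_per_frame * fps)


dots_per_second = DOTS_PER_FRAME * STOCK_FPS               # 835584.0
stock_cycles = cycles_per_dot(STOCK_CPU_HZ, STOCK_FPS)     # just under 17
print(f"{dots_per_second:.0f} dots/s, {stock_cycles:.2f} cycles/dot")
```

The same function can be fed the clock and frame rate of another machine to get a comparable figure, keeping in mind that it is an average that also includes everything else the program does per frame.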
The scaling routines fit any rectangle from a texture into a rectangle of any
size and ratio of another texture with nearest-neighbor matching. Optionally,
they can flip the rectangles horizontally and/or vertically, and treat as
transparent the dots of a specific color (color-keying) or of color 0 (zero-
keying).
Color/zero-keying allows rendering graphics of arbitrary shapes without masks
(which saves RAM and CPU cycles). Thanks to the fact that PED81C graphics always
use at most 81 colors, there are 256-81 = 175 colors that can be used for color-
keying without causing any visual loss.
For performance reasons, there are 3 separate routines.
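To make the flipping and color/zero-keying options concrete, here is a minimal Python sketch. It deliberately differs from the real code: the manual states that there are separate assembly routines, whereas this example folds everything into one function with a hypothetical `key` parameter (`None` for no keying, `0` for zero-keying, any other value for color-keying). The names and the `(x, y, width, height)` rectangle layout are assumptions.

```python
def scale_keyed(src, src_rect, dst, dst_rect,
                key=None, flip_x=False, flip_y=False):
    """Nearest-neighbor scaling with optional flipping and color-keying.

    Dots equal to `key` are treated as transparent, so the destination
    dots underneath them are left untouched.
    """
    sx, sy, sw, sh = src_rect
    dx, dy, dw, dh = dst_rect
    for row in range(dh):
        v = row * sh // dh
        if flip_y:
            v = sh - 1 - v          # read the source rows bottom-up
        for col in range(dw):
            u = col * sw // dw
            if flip_x:
                u = sw - 1 - u      # read the source columns right-to-left
            dot = src[sy + v][sx + u]
            if key is None or dot != key:
                dst[dy + row][dx + col] = dot
    return dst


# Usage: zero-keying over a background filled with color 9.
sprite = [[0, 5],
          [7, 0]]
plain = scale_keyed(sprite, (0, 0, 2, 2),
                    [[9, 9], [9, 9]], (0, 0, 2, 2), key=0)
flipped = scale_keyed(sprite, (0, 0, 2, 2),
                      [[9, 9], [9, 9]], (0, 0, 2, 2), key=0, flip_x=True)
```

Testing the key for every dot costs time in the inner loop, which is one plausible reason for keeping non-keyed and keyed variants separate in performance-critical code.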
--------------------------------------------------------------------------------
OTHER TECHNICAL NOTES
* Logic and rendering are totally asynchronous: the logic runs always at 50 Hz
and the rendering never stops (unless it reaches 50 fps and the fps limit is
on), thus exploiting the machine's full potential.
* The screen is triple-buffered.
* When 2 rasters can be allocated in FAST RAM:
1. the graphics are rendered always to the available raster in FAST RAM;
2. after the rendering has completed and as soon as the bottom rasterline has
been displayed, the rendered raster is copied as quickly as possible
to the raster in CHIP RAM (which is the one that gets displayed).
The copy successfully races the beam (on the expanded Amiga 1200 mentioned in
the PERFORMANCE section, it requires about 57 rasterlines during the vertical
blanking and 35 rasterlines during the fetching of the top rasterlines), so no
tearing occurs.
Such method yields a faster performance than rendering directly to a raster in
CHIP RAM (especially when there is overdraw and/or data gets also read from
the raster).
* The screen resolution is 1020x256 SHRES pixels, which correspond to 255x256
LORES-sized physical dots and to 128x256 logical dots.
* The code is 100% assembly.
* The program takes over the system entirely and returns to AmigaOS cleanly.
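The rasterline figures given for the FAST RAM to CHIP RAM copy can be turned into a rough throughput estimate. The Python calculation below is a back-of-the-envelope sketch under explicit assumptions, none of which is confirmed by the manual: a frame is taken to be 313 rasterlines long (the figure mentioned in the voxel engine changelog), the refresh rate is taken as exactly 50 Hz, and the copied raster is taken to be 128x256 bytes.

```python
LINES_PER_FRAME = 313        # assumption: rasterlines per frame
DISPLAYED_LINES = 256        # screen height from the manual
REFRESH_HZ = 50              # assumption: exactly 50 frames per second
RASTER_BYTES = 128 * 256     # assumption: one byte per logical dot

COPY_LINES_BLANKING = 57     # rasterlines spent during vertical blanking
COPY_LINES_DISPLAY = 35      # rasterlines spent while the top lines are fetched

# The lines not used by the display match the 57 lines quoted above.
blanking_lines = LINES_PER_FRAME - DISPLAYED_LINES

rasterline_seconds = 1 / (REFRESH_HZ * LINES_PER_FRAME)
copy_lines = COPY_LINES_BLANKING + COPY_LINES_DISPLAY       # 92
copy_seconds = copy_lines * rasterline_seconds
copy_bytes_per_second = RASTER_BYTES / copy_seconds         # roughly 5.57 million
frame_fraction = copy_lines / LINES_PER_FRAME               # a bit under 30%

print(f"copy: {copy_seconds * 1000:.2f} ms, "
      f"{copy_bytes_per_second / 1e6:.2f} MB/s, "
      f"{frame_fraction:.0%} of a frame")
```

Under these assumptions the copy occupies a bit less than a third of each frame, which leaves the rest of the frame time for rendering to FAST RAM.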
Comment edited on 02/04/2024 at 22:24:34
Unix/Linux SysAdmin - proud or crazy owner of an AmigaOne
Posts: 3550
Comment 30
VagaPPC
4 April 2024 16:27:15
Truly incredible.
But can these things be used with your AMOS library?
Old System Amiga 500,1200, A4000/60 PowerPPC, CybervisionPPC, SUN Ultra5, PowerMAC G4 450Mhz 1Gb
Posts: 703
Comment 31
saimo
4 April 2024 18:38:53
VagaPPC
Truly incredible.
Thanks!
VagaPPC
But can these things be used with your AMOS library?
In theory, with ALS one could open a PED81C screen with one or more other layers (PED81C ones too) overlaid on it, and then call the machine-language scaling routines for the effects. In practice, however, the performance would collapse: PED81C requires the bitplanes to be displayed in SHRES, which means that there is a massive data transfer on the CHIP bus (to give an idea: a PED81C screen requires twice the memory and twice the CHIP bus data transfer of a LORES 256-color screen of equivalent size), so overlaying further layers that are just as heavy would saturate the bus and thus slow down access to it by the CPU, Blitter and Copper. All of this would also hold if everything were programmed in assembly (with which, obviously, layers can be created just as ALS does).
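To put rough numbers on the "twice the CHIP bus data transfer" remark, here is a small Python calculation of the bitplane data fetched per frame. It is a sketch under assumptions that are not spelled out in this thread: the PED81C screen is taken as 4 SHRES bitplanes of 1024 fetched pixels per line (the 1020 visible pixels rounded up to 128 bytes), and the LORES screen used for comparison as 8 bitplanes of 256 pixels per line, both 256 lines tall.

```python
def bitplane_bytes_per_frame(pixels_per_line, lines, bitplanes):
    """Bytes of bitplane data fetched to display one frame."""
    bytes_per_line = pixels_per_line // 8    # 1 bit per pixel per bitplane
    return bytes_per_line * lines * bitplanes


# Assumed PED81C screen: 4 bitplanes, 1024 SHRES pixels fetched per line.
ped81c_bytes = bitplane_bytes_per_frame(1024, 256, 4)     # 131072

# Assumed reference: 256x256 LORES screen in 256 colors (8 bitplanes).
lores_bytes = bitplane_bytes_per_frame(256, 256, 8)       # 65536

ratio = ped81c_bytes / lores_bytes                        # 2.0
print(f"PED81C: {ped81c_bytes} bytes/frame, "
      f"LORES: {lores_bytes} bytes/frame, ratio {ratio}")
```

With these assumptions each additional PED81C layer would add the same amount again, which is consistent with the bus saturation described above.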
Comment edited on 04/04/2024 at 18:39:20