Roadmap

- Move all the fetchers etc. into pixman-image to make pixman-compose.c
  less intimidating.

  DONE

- Make combiners for unified alpha take a mask argument. That way
  we won't need two separate paths for unified vs component in the
  general compositing code.

  DONE, except that the Altivec code needs to be updated. Luca is
  looking into that.

- Delete separate 'unified alpha' path

  DONE

- Split images into their own files

  DONE

- Split the gradient walker code out into its own file

  DONE

- Add scanline getters per image

  DONE

- Generic 64 bit fetcher

  DONE

- Split fast path tables into their respective architecture dependent
  files.

  See "Render Algorithm" below for rationale

Images will eventually have these virtual functions:

       get_scanline()
       get_scanline_wide()
       get_pixel()
       get_pixel_wide()
       get_untransformed_pixel()
       get_untransformed_pixel_wide()
       get_unfiltered_pixel()
       get_unfiltered_pixel_wide()

       store_scanline()
       store_scanline_wide()
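The virtual function list above could be sketched as a per-image function
table in C. The struct and signatures below are illustrative assumptions,
not pixman's actual types:

```c
#include <stdint.h>
#include <stddef.h>

typedef struct image image_t;

typedef struct
{
    /* Fetch one scanline of 'width' pixels starting at (x, y) */
    void     (* get_scanline)      (image_t *image, int x, int y,
                                    int width, uint32_t *buffer);
    /* Same, but widened channels for high-precision formats */
    void     (* get_scanline_wide) (image_t *image, int x, int y,
                                    int width, uint64_t *buffer);
    /* Fetch a single pixel, after repeat/filter/transform */
    uint32_t (* get_pixel)         (image_t *image, int x, int y);

    /* Write one scanline back into a destination image */
    void     (* store_scanline)    (image_t *image, int x, int y,
                                    int width, const uint32_t *buffer);
} image_vtable_t;

struct image
{
    const image_vtable_t *vtable;
    /* image attributes (format, transform, repeat, ...) go here */
};

/* Example: a hypothetical solid image whose get_pixel ignores x, y */
static uint32_t
solid_get_pixel (image_t *image, int x, int y)
{
    (void) image; (void) x; (void) y;
    return 0xff0000ffu;   /* opaque blue in a8r8g8b8 */
}

static const image_vtable_t solid_vtable =
    { NULL, NULL, solid_get_pixel, NULL };
```

A solid image then only needs to fill in the entries it supports; the rest
stay NULL until reinit() installs something better.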

1.

Initially we will just have get_scanline() and get_scanline_wide();
these will be based on the ones in pixman-compose. Hopefully this will
reduce the complexity in pixman_composite_rect_general().

Note that there are access considerations - the compose function is
being compiled twice.


2.

Split image types into their own source files. Export noop virtual
reinit() call. Call this whenever a property of the image changes.


3.

Split the get_scanline() call into smaller functions that are
initialized by the reinit() call.

The Render Algorithm:
        (first repeat, then filter, then transform, then clip)

Starting from a destination pixel (x, y), do

        1 x = x - xDst + xSrc
          y = y - yDst + ySrc

        2 reject pixel that is outside the clip

        This treats clipping as something that happens after
        transformation, which I think is correct for client clips. For
        hierarchy clips it is wrong, but who really cares? Without
        GraphicsExposes hierarchy clips are basically irrelevant. Yes,
        you could imagine cases where the pixels of a subwindow of a
        redirected, transformed window should be treated as
        transparent. I don't really care.

        Basically, I think the render spec should say that pixels that
        are unavailable due to the hierarchy have undefined content,
        and that GraphicsExposes are not generated. Ie., basically
        that using non-redirected windows as sources is unsupported. This is
        at least consistent with the current implementation and we can
        update the spec later if someone makes it work.

        The implication for render is that it should stop passing the
        hierarchy clip to pixman. In pixman, if a source image has a
        clip it should be used in computing the composite region and
        nowhere else, regardless of what "has_client_clip" says. The
        default should be for there to not be any clip.

        I would really like to get rid of the client clip as well for
        source images, but unfortunately there is at least one
        application in the wild that uses them.

        3 Transform pixel: (x, y) = T(x, y)

        4 Call p = GetUntransformedPixel (x, y)

        5 If the image has an alpha map, then

                Call GetUntransformedPixel (x, y) on the alpha map

                add resulting alpha channel to p

          return p

Where GetUntransformedPixel is:

        6 switch (filter)
          {
          case NEAREST:
              return GetUnfilteredPixel (x, y);
              break;

          case BILINEAR:
              return GetUnfilteredPixel (...)     // 4 times
              break;

          case CONVOLUTION:
              return GetUnfilteredPixel (...)     // as many times as necessary.
              break;
          }

Where GetUnfilteredPixel (x, y) is

        7 switch (repeat)
          {
          case REPEAT_NORMAL:
          case REPEAT_PAD:
          case REPEAT_REFLECT:
              // adjust x, y as appropriate
              break;

          case REPEAT_NONE:
              if (x, y) is outside image bounds
                  return 0;
              break;
          }

          return GetRawPixel (x, y)

Where GetRawPixel (x, y) is

        8 Compute the pixel in question, depending on image type.

For gradients, repeat has a totally different meaning, so
UnfilteredPixel() and RawPixel() must be the same function so that
gradients can do their own repeat algorithm.

So, the GetRawPixel

        for bits must deal with repeats
        for gradients must deal with repeats (differently)
        for solids, should ignore repeats.

        for polygons, when we add them, either ignore repeats or do
        something similar to bits (in which case, we may want an extra
        layer of indirection to modify the coordinates).
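The coordinate adjustment in step 7 above can be sketched per repeat mode.
This is a hedged illustration of the three modes that remap (x, y) into the
image bounds, not pixman's actual implementation:

```c
/* Remap a coordinate into [0, size) for REPEAT_NORMAL (tiling) */
static int
repeat_normal (int v, int size)
{
    v %= size;
    if (v < 0)
        v += size;              /* C's % can return negatives */
    return v;
}

/* Clamp to the nearest edge for REPEAT_PAD */
static int
repeat_pad (int v, int size)
{
    if (v < 0)
        return 0;
    if (v >= size)
        return size - 1;
    return v;
}

/* Mirror back and forth for REPEAT_REFLECT */
static int
repeat_reflect (int v, int size)
{
    v = repeat_normal (v, 2 * size);
    if (v >= size)
        v = 2 * size - 1 - v;
    return v;
}
```

REPEAT_NONE needs no adjustment; it just rejects out-of-bounds coordinates
and returns 0, as in the pseudo-code above.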

It is then possible to build things like "get scanline" or "get tile" on
top of this. In the simplest case, just repeatedly calling GetPixel()
would work, but specialized get_scanline()s or get_tile()s could be
plugged in for common cases.

By not plugging anything in for images with access functions, we only
have to compile the pixel functions twice, not the scanline functions.

And we can get rid of fetchers for the bizarre formats that no one
uses, such as b2g3r3. r1g2b1? Seriously? It is also worth
considering a generic format based pixel fetcher for these edge cases.

Since the actual routines depend on the image attributes, the images
must be notified when those change and update their function pointers
appropriately. So there should probably be a virtual function called
(* reinit) or something like that.

There will also be wide fetchers for both pixels and lines. The line
fetcher will just call the wide pixel fetcher. The wide pixel fetcher
will just call expand, except for 10 bit formats.
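The "expand" mentioned above can be sketched for 8 bit channels: widen a
channel by replicating its bits so that 0xff maps to 0xffff and 0 maps to 0.
This is illustrative code under that assumption; 10 bit formats need a
different expansion, which is why they are the exception:

```c
#include <stdint.h>

/* Widen an 8 bit channel to 16 bits by bit replication */
static uint16_t
expand_8_to_16 (uint8_t c)
{
    return (uint16_t) ((c << 8) | c);
}

/* Expand a whole a8r8g8b8 pixel to 16 bits per channel */
static uint64_t
expand_pixel_8888 (uint32_t p)
{
    uint64_t a = expand_8_to_16 ((p >> 24) & 0xff);
    uint64_t r = expand_8_to_16 ((p >> 16) & 0xff);
    uint64_t g = expand_8_to_16 ((p >>  8) & 0xff);
    uint64_t b = expand_8_to_16 (p & 0xff);

    return (a << 48) | (r << 32) | (g << 16) | b;
}
```

Bit replication has the nice property that full intensity stays full
intensity, which a plain shift (c << 8) would not preserve.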

Rendering pipeline:

Drawable:
        0. if (picture has alpha map)
                0.1. Position alpha map according to the alpha_x/alpha_y
                0.2. Replace the alpha channel of source with the one
                     from the alpha map. Replacement only takes place
                     in the intersection of the two drawables' geometries.
        1. Repeat the drawable according to the repeat attribute
        2. Reconstruct a continuous image according to the filter
        3. Transform according to the transform attribute
        4. Position image such that src_x, src_y is over dst_x, dst_y
        5. Sample once per destination pixel
        6. Clip. If a pixel is not within the source clip, then no
           compositing takes place at that pixel. (Ie., it's *not*
           treated as 0).

Sampling a drawable:

        - If the image does not have an alpha channel, the pixels in it
          are treated as opaque.

Note on reconstruction:

        - The top left pixel has coordinates (0.5, 0.5) and pixels are
          spaced 1 apart.

Gradient:
        1. Unless gradient type is conical, repeat the underlying (0, 1)
           gradient according to the repeat attribute
        2. Integrate the gradient across the plane according to type.
        3. Transform according to transform attribute
        4. Position gradient
        5. Sample once per destination pixel.
        6. Clip

Solid Fill:
        1. Repeat has no effect
        2. Image is already continuous and defined for the entire plane
        3. Transform has no effect
        4. Positioning has no effect
        5. Sample once per destination pixel.
        6. Clip

Polygon:
        1. Repeat has no effect
        2. Image is already continuous and defined on the whole plane
        3. Transform according to transform attribute
        4. Position image
        5. Supersample 15x17 per destination pixel.
        6. Clip

Possibly interesting additions:
        - More general transformations, such as warping, or general
          shading.

        - Shader image where a function is called to generate the
          pixel (ie., uploading assembly code).

        - Resampling kernels

          In principle the polygon image uses a 15x17 box filter for
          resampling. If we allow general resampling filters, then we
          get all the various antialiasing types for free.

          Bilinear downsampling looks terrible and could be much
          improved by a resampling filter. NEAREST reconstruction
          combined with a box resampling filter is what GdkPixbuf
          does, I believe.

          Useful for high frequency gradients as well.

          (Note that the difference between a reconstruction and a
          resampling filter is mainly where in the pipeline they
          occur. High quality resampling should use a correctly
          oriented kernel, so it should happen after transformation.

          An implementation can transform the resampling kernel and
          convolve it with the reconstruction if it so desires, but it
          will need to deal with the fact that the resampling kernel
          will not necessarily be pixel aligned.)

          "Output kernels"

          One could imagine doing the resampling after compositing,
          ie., for each destination pixel sample each source image 16
          times, then composite those subpixels individually, then
          finally apply a kernel.

          However, this is effectively the same as full screen
          antialiasing, which is a simpler way to think about it. So
          resampling kernels may make sense for individual images, but
          not as a post-compositing step.

          Fullscreen AA is inefficient without chained compositing
          though. Consider an (image scaled up to oversample size IN
          some polygon) scaled down to screen size. With the current
          implementation, there will be a huge temporary. With chained
          compositing, the whole thing ends up being equivalent to the
          output kernel from above.

        - Color space conversion

          The complete model here is that each surface has a color
          space associated with it and that the compositing operation
          also has one associated with it. Note also that gradients
          should have associated colorspaces.

        - Dithering

          If people dither something that is already dithered, it will
          look terrible, but don't do that, then. (Dithering happens
          after resampling, if at all - what is the relationship
          with color spaces? Presumably dithering should happen in linear
          intensity space).

        - Floating point surfaces, 16, 32 and possibly 64 bit per
          channel.

Maybe crack:

        - Glyph polygons

          If glyphs could be given as polygons, they could be
          positioned and rasterized more accurately. The glyph
          structure would need subpixel positioning though.

        - Luminance vs. coverage for the alpha channel

          Whether the alpha channel should be interpreted as luminance
          modulation or as coverage (intensity modulation). This is a
          bit of a departure from the rendering model though. It could
          also be considered whether it should be possible to have
          both channels in the same drawable.

        - Alternative for component alpha

          - Set component-alpha on the output image.

            - This means each of the components is sampled
              independently and composited in the corresponding
              channel only.

          - Have 3x oversampled mask

          - Scale it down by 3 horizontally, with [ 1/3, 1/3, 1/3 ]
            resampling filter.

          Is this equivalent to just using a component alpha mask?
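The "scale down by 3 horizontally" step can be sketched directly: each
output mask value is the average of three adjacent subpixel coverage
samples. Illustrative code, not an actual pixman routine:

```c
#include <stdint.h>

/* Apply the [ 1/3, 1/3, 1/3 ] box filter: src holds 3 * dst_width
 * coverage samples; dst receives one averaged value per pixel. */
static void
downscale_mask_3x (const uint8_t *src, uint8_t *dst, int dst_width)
{
    int i;

    for (i = 0; i < dst_width; ++i)
    {
        int sum = src[3 * i] + src[3 * i + 1] + src[3 * i + 2];

        dst[i] = (uint8_t) (sum / 3);
    }
}
```

With a component alpha mask the three averaged values would instead land in
the r, g and b channels of one mask pixel, which is what the question above
is getting at.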

Incompatible changes:

        - Gradients could be specified with premultiplied colors. (You
          can use a mask to get things like gradients from solid red to
          transparent red.)


Refactoring pixman

The pixman code is not particularly nice, to put it mildly. Among the
issues are

- inconsistent naming style (fb vs Fb, camelCase vs
  underscore_naming). Sometimes there is even inconsistency *within*
  one name.

       fetchProc32 ACCESS(pixman_fetchProcForPicture32)

  may be one of the ugliest names ever created.

  coding style:
        use the one from cairo except that pixman uses this brace style:

        while (blah)
        {
        }

        Format do/while like this:

        do
        {
        }
        while (...);

- PIXMAN_COMPOSITE_RECT_GENERAL() is horribly complex

- switch-case logic in pixman-access.c

  Instead it would be better to just store function pointers in the
  image objects themselves,

        get_pixel()
        get_scanline()

- Much of the scanline fetching code is for formats that no one
  ever uses. a2r2g2b2 anyone?

  It would probably be worthwhile having a generic fetcher for any
  pixman format whatsoever.

- Code related to particular image types should be split into individual
  files.

        pixman-bits-image.c
        pixman-linear-gradient-image.c
        pixman-radial-gradient-image.c
        pixman-solid-image.c

- Fast path code should be split into files based on architecture:

        pixman-mmx-fastpath.c
        pixman-sse2-fastpath.c
        pixman-c-fastpath.c

  etc.

  Each of these files should then export a fastpath table, which would
  be declared in pixman-private.h. This should allow us to get rid
  of the pixman-mmx.h files.

  The fast path table should describe each fast path. Ie., there should
  be bitfields indicating what things the fast path can handle, rather
  than, as now, allowing only one format per src/mask/dest. Ie.,

        {
            FAST_a8r8g8b8 | FAST_x8r8g8b8,
            FAST_null,
            FAST_x8r8g8b8,
            FAST_repeat_normal | FAST_repeat_none,
            the_fast_path
        }
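Matching against such a bitfield-based entry is then a simple mask-and-test
per field. The flag values and struct layout below are assumptions for
illustration, not the real pixman fast path table:

```c
#include <stdint.h>
#include <stddef.h>

#define FAST_null          (1u << 0)
#define FAST_a8r8g8b8      (1u << 1)
#define FAST_x8r8g8b8      (1u << 2)
#define FAST_repeat_none   (1u << 3)
#define FAST_repeat_normal (1u << 4)

typedef struct
{
    uint32_t src_formats;
    uint32_t mask_formats;
    uint32_t dest_formats;
    uint32_t repeat_modes;
    void   (* func) (void);
} fast_path_t;

/* Return the first entry that can handle this combination, or NULL */
static const fast_path_t *
lookup_fast_path (const fast_path_t *table, size_t    n,
                  uint32_t src, uint32_t mask,
                  uint32_t dest, uint32_t repeat)
{
    size_t i;

    for (i = 0; i < n; ++i)
    {
        if ((table[i].src_formats  & src)  &&
            (table[i].mask_formats & mask) &&
            (table[i].dest_formats & dest) &&
            (table[i].repeat_modes & repeat))
        {
            return &table[i];
        }
    }
    return NULL;
}
```

One entry can now cover a8r8g8b8 and x8r8g8b8 sources at once, instead of
needing a duplicated row per format combination.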

There should then be *one* file that implements pixman_image_composite().
This should do this:

        optimize_operator();

        convert 1x1 repeat to solid (actually this should be done at
        image creation time).

        is there a useful fastpath?

There should be a file called pixman-cpu.c that contains all the
architecture specific stuff to detect what CPU features we have.

Issues that must be kept in mind:

       - we need accessor code to be preserved

       - maybe there should be a "store_scanline" too?

         Is this sufficient?

         We should preserve the optimization where the
         compositing happens directly in the destination
         whenever possible.

       - It should be possible to create GPU samplers from the
         images.

The "horizontal" classification should be a bit in the image, the
"vertical" classification should just happen inside the gradient
file. Note though that

        (a) these will change if the transformation/repeat changes.

        (b) at the moment the optimization for linear gradients
            takes the source rectangle into account. Presumably
            this is to also optimize the case where the gradient
            is close enough to horizontal?

Who is responsible for repeats? In principle it should be the scanline
fetch. Right now NORMAL repeats are handled by walk_composite_region()
while other repeats are handled by the scanline code.


(Random note on filtering: do you filter before or after
transformation? Hardware is going to filter after transformation;
this is also what pixman does currently). It's not completely clear
what filtering *after* transformation means. One thing that might look
good would be to do *supersampling*, ie., compute multiple subpixels
per destination pixel, then average them together.
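The supersampling idea can be sketched with an abstract sample callback
(an assumption for illustration): evaluate a small grid of subpixel samples
per destination pixel and average the channels.

```c
#include <stdint.h>

typedef uint32_t (* sample_func_t) (double x, double y);

/* Average a 2x2 grid of subsamples for destination pixel (px, py) */
static uint32_t
supersample_2x2 (sample_func_t sample, int px, int py)
{
    int a = 0, r = 0, g = 0, b = 0;
    int i, j;

    for (j = 0; j < 2; ++j)
    {
        for (i = 0; i < 2; ++i)
        {
            /* subpixel centers at 0.25 and 0.75 within the pixel */
            uint32_t p = sample (px + 0.25 + 0.5 * i,
                                 py + 0.25 + 0.5 * j);

            a += (p >> 24) & 0xff;
            r += (p >> 16) & 0xff;
            g += (p >>  8) & 0xff;
            b += p & 0xff;
        }
    }
    return ((uint32_t) (a / 4) << 24) | ((uint32_t) (r / 4) << 16) |
           ((uint32_t) (g / 4) <<  8) |  (uint32_t) (b / 4);
}

/* Demo source: opaque white where x < 0.5, transparent elsewhere */
static uint32_t
demo_sample (double x, double y)
{
    (void) y;
    return x < 0.5 ? 0xffffffffu : 0x00000000u;
}
```

A pixel straddling the edge of the demo source averages to half coverage,
which is exactly the antialiasing effect described above.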