Roadmap

- Move all the fetchers etc. into pixman-image to make pixman-compose.c
  less intimidating.

  DONE

- Make combiners for unified alpha take a mask argument. That way
  we won't need two separate paths for unified vs component in the
  general compositing code.

  DONE, except that the Altivec code needs to be updated. Luca is
  looking into that.

- Delete separate 'unified alpha' path

  DONE

- Split images into their own files

  DONE

- Split the gradient walker code out into its own file

  DONE

- Add scanline getters per image

  DONE

- Generic 64 bit fetcher

  DONE

- Split fast path tables into their respective architecture dependent
  files.

See "Render Algorithm" below for rationale.

Images will eventually have these virtual functions (a rough C sketch
follows the numbered steps below):

        get_scanline()
        get_scanline_wide()
        get_pixel()
        get_pixel_wide()
        get_untransformed_pixel()
        get_untransformed_pixel_wide()
        get_unfiltered_pixel()
        get_unfiltered_pixel_wide()

        store_scanline()
        store_scanline_wide()

1.

Initially we will just have get_scanline() and get_scanline_wide();
these will be based on the ones in pixman-compose. Hopefully this will
reduce the complexity in pixman_composite_rect_general().

Note that there are access considerations - the compose function is
compiled twice, once with and once without the accessor wrappers.

2.

Split image types into their own source files. Export a no-op virtual
reinit() call. Call this whenever a property of the image changes.

3.

Split the get_scanline() call into smaller functions that are
initialized by the reinit() call.
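For illustration, here is a minimal C sketch of what such a per-image
virtual function table could look like. The struct layout and typedef
names are made up; only the function names come from the list above.

    #include <stdint.h>

    /* Hypothetical per-image vtable; illustrative only. */
    typedef struct image image_t;

    typedef void     (* fetch_scanline_t)      (image_t *image, int x, int y,
                                                int width, uint32_t *buffer);
    typedef void     (* fetch_scanline_wide_t) (image_t *image, int x, int y,
                                                int width, uint64_t *buffer);
    typedef uint32_t (* fetch_pixel_t)         (image_t *image, int x, int y);
    typedef void     (* reinit_t)              (image_t *image);

    struct image
    {
        fetch_scanline_t       get_scanline;        /* 8 bits per channel  */
        fetch_scanline_wide_t  get_scanline_wide;   /* 16 bits per channel */
        fetch_pixel_t          get_pixel;
        fetch_pixel_t          get_unfiltered_pixel;
        fetch_pixel_t          get_untransformed_pixel;
        reinit_t               reinit;   /* re-selects the pointers above */
    };

    /* Whenever a property of the image changes: */
    static void
    image_property_changed (image_t *image)
    {
        image->reinit (image);
    }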
The Render Algorithm:
    (first repeat, then filter, then transform, then clip)

Starting from a destination pixel (x, y), do

    1 x = x - xDst + xSrc
      y = y - yDst + ySrc

    2 reject the pixel if it is outside the clip

      This treats clipping as something that happens after
      transformation, which I think is correct for client clips. For
      hierarchy clips it is wrong, but who really cares? Without
      GraphicsExposes hierarchy clips are basically irrelevant. Yes,
      you could imagine cases where the pixels of a subwindow of a
      redirected, transformed window should be treated as
      transparent. I don't really care.

      Basically, I think the render spec should say that pixels that
      are unavailable due to the hierarchy have undefined content,
      and that GraphicsExposes are not generated. I.e., basically
      that using non-redirected windows as sources is broken. This is
      at least consistent with the current implementation, and we can
      update the spec later if someone makes it work.

      The implication for render is that it should stop passing the
      hierarchy clip to pixman. In pixman, if a source image has a
      clip, it should be used in computing the composite region and
      nowhere else, regardless of what "has_client_clip" says. The
      default should be for there to not be any clip.

      I would really like to get rid of the client clip as well for
      source images, but unfortunately there is at least one
      application in the wild that uses them.

    3 Transform the pixel: (x, y) = T(x, y)

    4 Call p = GetUntransformedPixel (x, y)

    5 If the image has an alpha map, then

          call GetUntransformedPixel (x, y) on the alpha map and

          add the resulting alpha channel to p.

      Return p.

Where GetUntransformedPixel is:

    6 switch (filter)
      {
      case NEAREST:
          return GetUnfilteredPixel (x, y);

      case BILINEAR:
          return GetUnfilteredPixel (...); // 4 times

      case CONVOLUTION:
          return GetUnfilteredPixel (...); // as many times as necessary
      }

Where GetUnfilteredPixel (x, y) is:

    7 switch (repeat)
      {
      case REPEAT_NORMAL:
      case REPEAT_PAD:
      case REPEAT_REFLECT:
          // adjust x, y as appropriate (sketched at the end of
          // this section)
          break;

      case REPEAT_NONE:
          if (x, y) is outside the image bounds
              return 0;
          break;
      }

      return GetRawPixel (x, y);

Where GetRawPixel (x, y) is:

    8 Compute the pixel in question, depending on the image type.

For gradients, repeat has a totally different meaning, so
UnfilteredPixel() and RawPixel() must be the same function so that
gradients can do their own repeat algorithm.

So GetRawPixel

    for bits must deal with repeats
    for gradients must deal with repeats (differently)
    for solids should ignore repeats

    for polygons, when we add them, should either ignore repeats or do
    something similar to bits (in which case we may want an extra
    layer of indirection to modify the coordinates).

It is then possible to build things like "get scanline" or "get tile" on
top of this. In the simplest case, just repeatedly calling GetPixel()
would work, but specialized get_scanline()s or get_tile()s could be
plugged in for common cases.

By not plugging anything in for images with access functions, we only
have to compile the pixel functions twice, not the scanline functions.

And we can get rid of fetchers for the bizarre formats that no one
uses, such as b2g3r3. r1g2b1? Seriously? It is also worth
considering a generic format-based pixel fetcher for these edge cases.

Since the actual routines depend on the image attributes, the images
must be notified when those change so they can update their function
pointers appropriately. So there should probably be a virtual function
called (* reinit) or something like that.

There will also be wide fetchers for both pixels and lines. The line
fetcher will just call the wide pixel fetcher. The wide pixel fetcher
will just call expand, except for 10 bit formats.
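As a concrete example of the "adjust x, y as appropriate" step in
case 7 above, here is one way the repeat modes could adjust a single
coordinate before the raw pixel is fetched. pixman_repeat_t and the
PIXMAN_REPEAT_* values are pixman's public enum; the helper itself is
just an illustrative sketch (the same logic applies independently to
y with the image height).

    #include <pixman.h>

    static int
    repeat_coordinate (pixman_repeat_t repeat, int x, int size)
    {
        switch (repeat)
        {
        case PIXMAN_REPEAT_NORMAL:
            x %= size;
            if (x < 0)
                x += size;          /* C's % can yield negatives */
            return x;

        case PIXMAN_REPEAT_PAD:
            return x < 0 ? 0 : (x >= size ? size - 1 : x);

        case PIXMAN_REPEAT_REFLECT:
            x %= 2 * size;
            if (x < 0)
                x += 2 * size;
            return x < size ? x : 2 * size - 1 - x;

        case PIXMAN_REPEAT_NONE:
        default:
            return x;   /* caller returns 0 for out-of-bounds coords */
        }
    }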
Rendering pipeline:

Drawable:
    0. if (picture has alpha map)
        0.1. Position the alpha map according to alpha_x/alpha_y
        0.2. Replace the alpha channel of the source with the one
             from the alpha map. Replacement only takes place
             within the intersection of the two drawables'
             geometries.
    1. Repeat the drawable according to the repeat attribute
    2. Reconstruct a continuous image according to the filter
    3. Transform according to the transform attribute
    4. Position the image such that src_x, src_y is over dst_x, dst_y
    5. Sample once per destination pixel
    6. Clip. If a pixel is not within the source clip, then no
       compositing takes place at that pixel. (I.e., it is *not*
       treated as 0).

    Sampling a drawable:

    - If the drawable does not have an alpha channel, its pixels
      are treated as opaque.

    Note on reconstruction:

    - The top left pixel has coordinates (0.5, 0.5) and pixels are
      spaced 1 apart (see the coordinate sketch after these
      pipelines).

Gradient:
    1. Unless the gradient type is conical, repeat the underlying
       (0, 1) gradient according to the repeat attribute
    2. Integrate the gradient across the plane according to type
    3. Transform according to the transform attribute
    4. Position the gradient
    5. Sample once per destination pixel
    6. Clip

Solid fill:
    1. Repeat has no effect
    2. The image is already continuous and defined for the entire plane
    3. Transform has no effect
    4. Positioning has no effect
    5. Sample once per destination pixel
    6. Clip

Polygon:
    1. Repeat has no effect
    2. The image is already continuous and defined on the whole plane
    3. Transform according to the transform attribute
    4. Position the image
    5. Supersample 15x17 per destination pixel
    6. Clip
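To make steps 3-5 of the drawable pipeline and the reconstruction
note concrete, here is a small sketch that maps the center of a
destination pixel into source space through the inverse transform.
The pixman_* types and pixman_transform_point_3d() are pixman's
public API; the helper itself is illustrative.

    #include <pixman.h>

    static void
    sample_position (const pixman_transform_t *inverse,
                     int                       dst_x,
                     int                       dst_y,
                     pixman_fixed_t           *src_x,
                     pixman_fixed_t           *src_y)
    {
        pixman_vector_t v;

        /* Pixel centers sit at (0.5, 0.5); coordinates are 16.16
         * fixed point. */
        v.vector[0] = pixman_int_to_fixed (dst_x) + pixman_fixed_1 / 2;
        v.vector[1] = pixman_int_to_fixed (dst_y) + pixman_fixed_1 / 2;
        v.vector[2] = pixman_fixed_1;

        pixman_transform_point_3d (inverse, &v);

        *src_x = v.vector[0];
        *src_y = v.vector[1];
    }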
Possibly interesting additions:

    - More general transformations, such as warping, or general
      shading.

    - Shader images, where a function is called to generate the
      pixel (i.e., uploading assembly code).

    - Resampling kernels

      In principle the polygon image uses a 15x17 box filter for
      resampling. If we allow general resampling filters, then we
      get all the various antialiasing types for free.

      Bilinear downsampling looks terrible and could be much
      improved by a resampling filter. NEAREST reconstruction
      combined with a box resampling filter is what GdkPixbuf
      does, I believe.

      Useful for high frequency gradients as well.

      (Note that the difference between a reconstruction and a
      resampling filter is mainly where in the pipeline they
      occur. High quality resampling should use a correctly
      oriented kernel, so it should happen after transformation.)

      An implementation can transform the resampling kernel and
      convolve it with the reconstruction if it so desires, but it
      will need to deal with the fact that the resampling kernel
      will not necessarily be pixel aligned.

      "Output kernels"

      One could imagine doing the resampling after compositing,
      i.e., for each destination pixel sample each source image 16
      times, then composite those subpixels individually, then
      finally apply a kernel.

      However, this is effectively the same as full screen
      antialiasing, which is a simpler way to think about it. So
      resampling kernels may make sense for individual images, but
      not as a post-compositing step.

      Fullscreen AA is inefficient without chained compositing,
      though. Consider an (image scaled up to oversample size IN
      some polygon) scaled down to screen size. With the current
      implementation, there will be a huge temporary. With chained
      compositing, the whole thing ends up being equivalent to the
      output kernel from above.

    - Color space conversion

      The complete model here is that each surface has a color
      space associated with it and that the compositing operation
      also has one associated with it. Note also that gradients
      should have associated color spaces.

    - Dithering

      If people dither something that is already dithered, it will
      look terrible, but don't do that, then. (Dithering happens
      after resampling, if at all - what is the relationship with
      color spaces? Presumably dithering should happen in linear
      intensity space.)

    - Floating point surfaces; 16, 32 and possibly 64 bits per
      channel.

Maybe crack:

    - Glyph polygons

      If glyphs could be given as polygons, they could be
      positioned and rasterized more accurately. The glyph
      structure would need subpixel positioning, though.

    - Luminance vs. coverage for the alpha channel

      Whether the alpha channel should be interpreted as luminance
      modulation or as coverage (intensity modulation). This is a
      bit of a departure from the rendering model, though. It is
      also worth considering whether it should be possible to have
      both channels in the same drawable.

    - Alternative for component alpha (sketched in code after this
      section)

      - Set component-alpha on the output image.

        - This means each of the components is sampled
          independently and composited in the corresponding
          channel only.

      - Have a 3x horizontally oversampled mask.

      - Scale it down by 3 horizontally, with a [ 1/3, 1/3, 1/3 ]
        resampling filter.

      Is this equivalent to just using a component alpha mask?

Incompatible changes:

    - Gradients could be specified with premultiplied colors. (You
      can use a mask to get things like gradients from solid red to
      transparent red.)
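One plausible reading of the component-alpha alternative above, in
code: collapse a mask that is 3x oversampled horizontally into an
x8r8g8b8-style component-alpha mask, sampling each channel at its own
subpixel and applying the [ 1/3, 1/3, 1/3 ] filter. This is entirely
illustrative - the names, the packing and the edge handling are all
assumptions.

    #include <stdint.h>

    static void
    downscale_mask_3x (const uint8_t *mask3,   /* width * 3 samples  */
                       int            width,
                       uint32_t      *ca_mask) /* width output pixels */
    {
        for (int i = 0; i < width; ++i)
        {
            uint32_t chan[3];

            for (int c = 0; c < 3; ++c)
            {
                /* Channel c samples subpixel 3*i + c; average it with
                 * its two neighbours, clamping at the mask edges. */
                int s = 3 * i + c;
                int l = (s > 0) ? s - 1 : s;
                int r = (s < 3 * width - 1) ? s + 1 : s;

                chan[c] = (mask3[l] + mask3[s] + mask3[r]) / 3;
            }

            *ca_mask++ = (chan[0] << 16) | (chan[1] << 8) | chan[2];
        }
    }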
Refactoring pixman

The pixman code is not particularly nice, to put it mildly. Among the
issues are:

- Inconsistent naming style (fb vs Fb, camelCase vs
  underscore_naming). Sometimes there is even inconsistency *within*
  one name.

      fetchProc32 ACCESS(pixman_fetchProcForPicture32)

  may be one of the ugliest names ever created.

  Coding style: use the one from cairo, except that pixman uses this
  brace style:

      while (blah)
      {
      }

  Format do/while like this:

      do
      {
      }
      while (...);

- PIXMAN_COMPOSITE_RECT_GENERAL() is horribly complex.

- Switch-case logic in pixman-access.c.

  Instead it would be better to just store function pointers in the
  image objects themselves:

      get_pixel()
      get_scanline()

- Much of the scanline fetching code is for formats that no one
  ever uses. a2r2g2b2, anyone?

  It would probably be worthwhile having a generic fetcher for any
  pixman format whatsoever.

- Code related to particular image types should be split into
  individual files:

      pixman-bits-image.c
      pixman-linear-gradient-image.c
      pixman-radial-gradient-image.c
      pixman-solid-image.c

- Fast path code should be split into files based on architecture:

      pixman-mmx-fastpath.c
      pixman-sse2-fastpath.c
      pixman-c-fastpath.c

      etc.

  Each of these files should then export a fast path table, which
  would be declared in pixman-private.h. This should allow us to get
  rid of the pixman-mmx.h files.

  The fast path table should describe each fast path. I.e., there
  should be bitfields indicating what things the fast path can handle,
  rather than, as now, only allowing one format per src/mask/dest.
  I.e.,

      {
          FAST_a8r8g8b8 | FAST_x8r8g8b8,
          FAST_null,
          FAST_x8r8g8b8,
          FAST_repeat_normal | FAST_repeat_none,
          the_fast_path
      }
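Expanded into C, such a table entry could look roughly like this. The
FAST_* flag names are taken from the example above; the struct layout,
typedefs and extern declarations are assumptions about how it could be
organized, not an existing pixman interface.

    #include <stdint.h>

    typedef uint32_t fast_path_flags_t;

    #define FAST_null           (1 << 0)
    #define FAST_a8r8g8b8       (1 << 1)
    #define FAST_x8r8g8b8       (1 << 2)
    #define FAST_repeat_none    (1 << 3)
    #define FAST_repeat_normal  (1 << 4)

    /* Real arguments (operator, images, rectangle) elided. */
    typedef void (* composite_func_t) (void);

    typedef struct
    {
        fast_path_flags_t src;     /* formats accepted as source   */
        fast_path_flags_t mask;    /* FAST_null means "no mask"    */
        fast_path_flags_t dest;
        fast_path_flags_t repeat;  /* repeat modes the path handles */
        composite_func_t  func;
    } fast_path_t;

    /* Each architecture file would export one table, declared in
     * pixman-private.h: */
    extern const fast_path_t mmx_fast_paths[];
    extern const fast_path_t sse2_fast_paths[];
    extern const fast_path_t c_fast_paths[];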
michael@0: michael@0: The "horizontal" classification should be a bit in the image, the michael@0: "vertical" classification should just happen inside the gradient michael@0: file. Note though that michael@0: michael@0: (a) these will change if the tranformation/repeat changes. michael@0: michael@0: (b) at the moment the optimization for linear gradients michael@0: takes the source rectangle into account. Presumably michael@0: this is to also optimize the case where the gradient michael@0: is close enough to horizontal? michael@0: michael@0: Who is responsible for repeats? In principle it should be the scanline michael@0: fetch. Right now NORMAL repeats are handled by walk_composite_region() michael@0: while other repeats are handled by the scanline code. michael@0: michael@0: michael@0: (Random note on filtering: do you filter before or after michael@0: transformation? Hardware is going to filter after transformation; michael@0: this is also what pixman does currently). It's not completely clear michael@0: what filtering *after* transformation means. One thing that might look michael@0: good would be to do *supersampling*, ie., compute multiple subpixels michael@0: per destination pixel, then average them together.