Mesa 21.3.0 Release Notes / 2021-11-17
Mesa 21.3.0 is a new development release. People who are concerned with stability and reliability should stick with a previous release or wait for Mesa 21.3.1.
Mesa 21.3.0 implements the OpenGL 4.6 API, but the version reported by glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) / glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used. Some drivers don’t support all the features required in OpenGL 4.6. OpenGL 4.6 is only available if requested at context creation. Compatibility contexts may report a lower version depending on each driver.
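Because the reported version varies by driver, an application usually checks it at run time rather than assuming 4.6. The following is a minimal sketch of that check, assuming the string has already been obtained from glGetString(GL_VERSION) in a live context. The function names `parse_gl_version` and `meets_minimum` and the sample strings are illustrative and not part of Mesa. The parser relies only on the OpenGL rule that the string begins with `major.minor`, prefixed by `OpenGL ES` on ES contexts.

```python
import re
from typing import Tuple

# Leading "major.minor", optionally preceded by the "OpenGL ES" prefix
# (or "OpenGL ES-CM"/"OpenGL ES-CL" on ES 1.x contexts).
_GL_VERSION_RE = re.compile(r"(?:OpenGL ES(?:-\w+)? )?(\d+)\.(\d+)")


def parse_gl_version(version_string: str) -> Tuple[int, int]:
    """Return (major, minor) parsed from a GL_VERSION string."""
    match = _GL_VERSION_RE.match(version_string)
    if match is None:
        raise ValueError(f"unrecognised GL_VERSION string: {version_string!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version_string: str, major: int, minor: int) -> bool:
    """True if the reported version is at least major.minor."""
    return parse_gl_version(version_string) >= (major, minor)


# Hypothetical strings of the kind a driver might report.
core = "4.6 (Core Profile) Mesa 21.3.0"
compat = "4.5 (Compatibility Profile) Mesa 21.3.0"
print(parse_gl_version(core), meets_minimum(compat, 4, 6))
```

In C, reading GL_MAJOR_VERSION and GL_MINOR_VERSION through glGetIntegerv avoids string parsing altogether. The string form is mainly useful when inspecting logs or tool output.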
Mesa 21.3.0 implements the Vulkan 1.2 API, but the version reported by the apiVersion property of the VkPhysicalDeviceProperties struct depends on the particular driver being used.
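The apiVersion field is a packed 32-bit integer, so comparing it against Vulkan 1.2 means unpacking it first. The sketch below mirrors the bit layout of the VK_MAKE_API_VERSION and VK_API_VERSION_* macros from the Vulkan headers (variant in bits 29-31, major in 22-28, minor in 12-21, patch in 0-11). The Python function names are illustrative, and the value being decoded is assumed to come from a real VkPhysicalDeviceProperties query made elsewhere.

```python
from typing import Tuple


def make_vk_api_version(variant: int, major: int, minor: int, patch: int) -> int:
    """Pack a version the same way VK_MAKE_API_VERSION does."""
    return (variant << 29) | (major << 22) | (minor << 12) | patch


def decode_vk_api_version(packed: int) -> Tuple[int, int, int, int]:
    """Unpack apiVersion into (variant, major, minor, patch)."""
    variant = (packed >> 29) & 0x7
    major = (packed >> 22) & 0x7F
    minor = (packed >> 12) & 0x3FF
    patch = packed & 0xFFF
    return variant, major, minor, patch


def supports_vulkan(packed: int, major: int, minor: int) -> bool:
    """True if the driver-reported version is at least major.minor."""
    _, dev_major, dev_minor, _ = decode_vk_api_version(packed)
    return (dev_major, dev_minor) >= (major, minor)


# Example with a locally constructed value standing in for a driver report.
reported = make_vk_api_version(0, 1, 2, 0)
print(decode_vk_api_version(reported), supports_vulkan(reported, 1, 2))
```

The patch component is deliberately ignored in the comparison, since feature availability is tied to the major and minor version.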
SHA256 checksum
a2753c09deef0ba14d35ae8a2ceff3fe5cd13698928c7bb62c2ec8736eb09ce1 mesa-21.3.0.tar.xz
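One way to check a downloaded tarball against the checksum above is sketched here with Python's standard hashlib module. The helper names are illustrative, and the file is assumed to be stored locally under the name shown in the checksum line.

```python
import hashlib

# Checksum published in these release notes for mesa-21.3.0.tar.xz.
EXPECTED_SHA256 = "a2753c09deef0ba14d35ae8a2ceff3fe5cd13698928c7bb62c2ec8736eb09ce1"


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large tarballs are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Return True if the file's SHA-256 digest matches the expected hex string."""
    return sha256_of_file(path) == expected.strip().lower()
```

Calling `verify_checksum("mesa-21.3.0.tar.xz")` in the download directory returns True only when the file matches the published digest. The `sha256sum` command-line tool performs the same comparison.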
New features
VK_EXT_color_write_enable on lavapipe
GL_ARB_texture_filter_anisotropic in llvmpipe
Anisotropic texture filtering in lavapipe
VK_EXT_shader_atomic_float2 on Intel and RADV.
VK_EXT_vertex_input_dynamic_state on RADV.
VK_KHR_timeline_semaphore on lavapipe
VK_EXT_external_memory_host on lavapipe
GL_AMD_pinned_memory on llvmpipe
GL 4.5 compatibility on llvmpipe
VK_EXT_primitive_topology_list_restart on RADV and lavapipe.
ES 3.2 on zink
VK_KHR_depth_stencil_resolve on lavapipe
VK_KHR_shader_integer_dot_product on RADV.
OpenGL FP16 support on llvmpipe
VK_KHR_shader_float16_int8 on lavapipe
VK_KHR_shader_subgroup_extended_types on lavapipe
VK_KHR_spirv_1_4 on lavapipe
Experimental raytracing support on RADV
VK_KHR_synchronization2 on Intel
NGG shader based culling is now enabled by default on GFX10.3 on RADV.
VK_KHR_maintenance4 on RADV
VK_KHR_format_feature_flags2 on RADV.
EGL_EXT_present_opaque on wayland
Bug fixes
RADV/ACO: Monster Hunter Rise Demo renders wrong results
radv: Odd lack of implicit host memory invalidation
Regression/Bisected: Crash in Godot games when steam overlay enabled
RADV: IsHelperInvocationEXT query is not considered volatile in ACO
ANV: error submitting with the same semaphore for wait and signal - regression?
[TGL+] anv: some dEQP-VK.drm_format_modifiers.* fails
Mesa 21.3rc3 - compile failure
iris: subslice assertion failure on some hardware during initialization
Final Fantasy V (Old version): Random text characters are not displayed
Diagonal rendering artifacts in Tomb Raider
dota2 black squares corruption
[hsw][bisected][regression] dEQP-VK.reconvergence.*nesting* failures
anv: dEQP-VK.wsi.wayland.<various> failures
radv_android.c: build errors due to commit 49c3a88
dEQP-EGL.functional.sharing.gles2.multithread.* regression with Iris
[radeonsi] Euro Truck Simulator 2: broken mimimap
[regression][bisected] Launching Valheim OpenGL game leads to GPU Hang
Android Meson build regression: hardware/system information apps crash on Raspberry Pi 4
radv: format properties are broken with modifiers
anv: dEQP-VK.graphicsfuzz.cov-multiple-one-iteration-loops-global-counter-write-matrices fails
iris: CCS modifier tests failing with suballocation
[RADV] For the game “World War Z: Aftermath” (Vulkan API) should used RADV_DEBUG=invariantgeom param
RADV: Resident Evil Village needs invariantgeom when NGG culling is enabled
radv: VK_EXT_vertex_input_dynamic_state
anv: dynamic state emission is busted
radeonsi: out of bounds access/compiler warning
RADV: Rendering issues in Resident Evil 2 with NGGC
GPU Hang/reset/forced reboot - latest mesa - mesa-demos/gloss
crocus: Incorrect stride when used through prime
radv: Vulkan games and demo apps are broken since “use DCC compatible with image stores for < 4K resolutions”
anv: descriptorBindingUniformBufferUpdateAfterBind feature is not supported
Cheza board reboots into another image on retry
freedreno: several regressions in org.skia.skqp.SkQPRunner
android: radv_android.c building errors after commits 9fc16b6 and 48cae11
iris: Implement memory sub-allocation
Assault Android Cactus ( STEAM AppID 250110) - Black triangles on Main menu character
World War Z - Renders red if FSR is enabled
Significant performance drop on Radeon HD 8400
turnip/a650: most VK_EXT_filter_cubic tests in dEQP-VK.texture.filtering.* fail
Ender Lilies: Turnip: Fails to render in-game
[nir][radv] Out of range shift when compiling Resident Evil Village shaders
GL_EXT_disjoint_timer_query glGetInteger64v GL_TIMESTAMP failing with GL_INVALID_ENUM
Valgrind errors in VBO display list code since vertex store rework
Issue with Turnip compilation on Oneplus 8
freedreno: primtype_mask
[radv] bufferImageGranularity is 64
../mesa-9999/src/amd/llvm/ac_llvm_helper.cpp:63:14: error: ‘class llvm::AttributeList’ has no member named ‘hasAttribute’; did you mean ‘getAttributes’?
GPU Reset POLARIS with Unigine Heaven and X4
RADV: consistent crash in Splitgate
llvmpipe doesn’t compile a shader with an inner scope in a for loop
llvmpipe doesn’t compile the increment of a for a loop
Mesa 21.2.1 implementation error: unexpected state[0] in make_state_flags()
freedreno: regression in org.skia.skqp.SkQPRunner#gles_localmatriximagefilter
[Radeonsi] VA-API Encoding no longer works on AMD PITCAIRN
turnip: Geometry flickering in Genshin Impact after 83e9a7fbcf53b90d0de66985dbbf91986fc7b05d
i915g: Need to link fail on non-unrolled loops
spirv2dxil.c:128:22: error: passing argument 7 of ‘spirv_to_dxil’ from incompatible pointer type [-Werror=incompatible-pointer-types]
OSMesa problem resizing
iris: Perform busy tracking for resources without GEM_BUSY/GEM_WAIT
[RADV] The game “Aliens: Fireteam Elite” start crashing after commit 2e56e2342094e8ec90afa5265b1c43503f662939
radeonsi: Smart Access Memory not being enabled by default?
Memory leak: si_get_shader_binary_size is missing a call to ac_rtld_close
dEQP-GLES3.stress.draw.unaligned_data.random.4 segfault
gl_DrawID is incorrect for glMultiDrawElementsBaseVertex/glMultiDrawElementsIndirect
iris: Scanout buffers now mapped WB cause glitches on screen
turnip: dEQP-VK.spirv_assembly.instruction.graphics.spirv_ids_abuse.lots_ids_* fails
i915g: nir_to_tgsi: Error : CONST[0]: The same register declared more than once
i915: GPU hang when doing FB fetch and gl_FragDepth write in one shader
../mesa-9999/src/amd/compiler/aco_instruction_selection.cpp:10009:30: error: ‘exchange’ is not a member of ‘std’
radv: disable DCC for displayable images with storage on navi12/14
RADV: Menu static/artifacts in Doom Eternal
Crash happens when testing GL_PIXEL_PACK_BUFFER
Possible miscompilation of an integer division with vulkan
panfrost G31 - Cathedral crash- opengl 2.1 game (I guess)
freedreno C++14 build error
panfrost / armv7 - crash with mesa newer than 21.0.3
iris: recursive mutex acquire when re-using BO with aux map
llvmpipe doesn’t compile a valid shader with an useless switch
i915g: dEQP-GLES2.functional.fbo.completeness.renderable.texture.color0.rgb10_a2 failure
i915g: polygon offset CTS failures
GetFragDataLocation(prog, “gl_FragColor”) generates INVALID_OPERATION, but specs don’t say it should
anv: VK_EXT_memory_budget doesn’t know about device local memory
turnip: dEQP-VK.api.version_check.entry_points regression
Possible miscompilation of a comparison with unsigned zero
i915g: FXT1 support
dEQP-VK.wsi.android.swapchain.create#image_swapchain_create_info crash on Android R
Nine Regression with util: Switch the non-block formats to unpacking rgba rows instead of rects.
Add an Intel NDK Android build job
android: anv building error after commit e08370d
panfrost G31 Unreal Tournament - various glitches (apitrace)
Miscompilation of a switch case
ci/virgl: “dEQP error: waiting got error - 16, slow gpu or hang?” flakes
[radeonsi][regression] CPU is being used ~10 times more than usual after c5478f9067f.
i915g: cos/sin accuracy
glGetTexImage with PBO is not accelerated on Gallium
radeonsi: bad performance on PBO packs
[kbl] GPU hang launching UE4Editor (unreal engine)
turnip: A few dEQP-VK.pipeline.framebuffer_attachment.* tests failing due to “FINISHME: unaligned store of msaa attachment”
ci: new freedreno trace job running for lavapipe
i915g: Emit TXP
The image is distorted while use iGPU(Intel GPU) rendering and output via dGPU (AMD GPU)
Radeon 5700XT: Small render glitches around “heat balls” in dhewm3 (Doom 3)
lima: regression in plbu scissors cmd
freedreno: regression in org.skia.skqp.SkQPRunner#gles_multipicturedraw_*_tiled
Incorrect rendering
intel/isl: Wrong surface format name in batch
Unused graph areas created for device and format in VK_LAYER_MESA_overlay
[RADV] FSR in Resident Evil: Village looks very pixelated on Polaris
iris: regression in yuzu
21.2.0rc1 Build Failure - GCC6.3
Crash in update_buffers after closing KDE “splash screen” downloader
Firefox (wayland) crash in wayland_platform
radeonsi: persistent, read-only buffer maps are slow to read
substance painter flickering with jagged texture and masks shown black
radv: FP16 mode in FidelityFX FSR doesn’t look right
Regression, ACO: DOOM Eternal hangs with ACO
Regression in Turnip with KGSL and Zink running opengl in proot
[bsw][i965][bisected][regression] waffle crashing after patch
Validation crash on wlroots after wl_shm appeared
[RADV] Blocky corruption in Scarlet Nexus and vkd3d-proton 2.4
Changes
Adam Jackson (18):
glx/drisw: Nerf PutImage when loaderPrivate == NULL
mesa: (correctly) flush more in _mesa_make_current
egl/dri2: Stop disabling pbuffer support on msaa configs
dri: Reformat DRI context attribute #defines
glx: Fix and simplify the share context compatibility check
glx: Store the context vtable on the glx screen
glx/dri2: Require the driver to support v4 of __DRI_DRI2
glx/drisw: Remove some misplaced error checks
glx/dri: Collect the GLX context attributes in a struct
glx: Simplify context API profile computation
glx: Remove some unused declarations from glxclient.h
glx: Move __glFreeAttributeState next to its one caller
glx: Clarify a debug message
glx: Don’t strip off window/pixmap support from float fbconfigs
wsi/x11: Fix a misunderstanding about how xcb_get_geometry works
wsi/x11: Fetch and discard the SYNC extension info
dri: Remove the allow_fp16_configs option, always allow them
egl/dri: Enable FP16 for EGL_EXT_platform_device
Adrian Bunk (1):
util/format: NEON is not available with the soft-float ABI
Alejandro Piñeiro (12):
broadcom: don’t define internal BPP values twice
vulkan: add vk_spec_info_to_nir_spirv util method
spirv: set medium precision with RelaxedPrecision decorator
broadcom/qpu: update/remove comments
broadcom/qpu: add new lookup opcode description helper
broadcom/qpu: use and expand version info at opcode description
broadcom/compiler: remove commented out vir_LOAD_IMM methods
broadcom/compiler: remove qpu_acc helper
broadcom/common: remove unused debug helper
v3d/v3dv: add unlikely for any V3D_DEBUG check
v3dv: use NULL for vk_error on initialization failures
v3dv/pipeline: don’t clone the nir shader at pipeline_state_create_binning
Alyssa Rosenzweig (243):
panfrost: Add perf_debug macros
panfrost: Warn on software conditional rendering
panfrost: Warn on going out of AFBC
panfrost: Log reasons for flushes
panfrost: Warn on get_fresh_batch_for_fbo
panfrost: Warn on get_fresh_batch
panfrost: Warn on transitions to linear
pan/bi: Copy liveness routines back
pan/bi: Copy back add_successor
pan/bi: Copy back bi_foreach_successor
pan/bi: Copy block bi_block
pan/bi: Clean up useless casts
pan/bi: Clean up liveness freeing
pan/bi: Shrink live array to 8-bits
meson: Build panfrost with tools=panfrost
panfrost: Remove unnecessary bifrost_compiler deps
panfrost: Only build libpanfrost with GL/VK
pan/bi: Add explicit cast for lod_or_mode
pan/bi: Remove duplicate NIR compiler options
pan/bi: Mark mod to string as maybe unused
panfrost,panvk: Remove broken v4 spilling code
targets/graw-xlib: Add missing dep_x11
pan/mdg: Garbage collect silly quirk
panfrost: Move context initalization to the vtable
panfrost: Make sampler view creation private
panfrost: Move sysval analysis out of per-gen
panfrost: Compile pan_cmdstream per-gen
panfrost: Statically determine uses_clamp
panfrost: Don’t make get_index_buffer_bounded per-gen
panfrost: Match sampler “nearest” names
panfrost: Share sampler code across archs
panfrost: Share blend code across architectures
panfrost: #ifdef pan_merge_empty_fs
panfrost: #ifdef fragment RSD packing
panfrost: Add a concatenation macro for genxml
panfrost: Use PAN_ARCH for the rest of pan_cmdstream
panfrost: Move init_batch to GenXML vtbl
panfrost: Make panfrost_batch_get_bifrost_tiler per-gen
panvk: Fix sampler filter modes on Bifrost
asahi: Identify texture address field
asahi: Fix sampler filtering flag
asahi: Identify texture dimension field
asahi: Set texture dimension field
asahi: Calculate cube map stride
asahi: Calculate resource offsets for cube maps
asahi: Implement cube map tiling transfers
asahi: Use agx_rsrc_offset for linear transfer_map
asahi: Allow tiled cube maps
asahi: Simplify can_tile type signature
asahi: Require tiling for cube maps
asahi: Assert texture layer is nonzero
agx: Don’t set helper invocation kill bit
agx: Fix mismatched units in load_ubo
agx: Dump register file when failing to allocate
agx: Use consistent ncomps
agx: Plug memory leak in register allocator
asahi: Enable instancing
agx: Drop dated /* TODO: RA */
agx: Handle load_instance_id
agx: Add agx_ushr helper
agx: Add udiv-by-constant routine
agx: Include divisors in the vertex shader key
agx: Implement instanced arrays
agx: Define p_extract for type converts
asahi: Pass instance_divisor to the compiler
agx: Add agx_format_shift routine
agx: Shift vertex buffer stride in the compiler
asahi: Add integers to agx_vertex_formats
asahi: Generalize src_offset for non-4byte formats
pan/va: Add initial ISA.xml for Valhall
pan/va: Add ISA.xml parser and support code
pan/va: Assert no instructions are duplicated
pan/va: Add Valhall assembler
pan/va: Check for FAU conflicts in the assembler
pan/va: Add disassembler generator
pan/va: Add dis/assembler test cases
pan/va: Add negative test cases for the assembler
pan/va: Add assembler test harness
pan/va: Add disassembler test harness
pan/va: Integrate the tests into meson test
pan/bi: Remove unused pointer from bi_instr
pan/bi: Remove unused option
pan/bi: Parse file names in standalone compiler
pan/bi: Zero initialize shader_info
pan/bi: Do more mesa/st stuff in standalone compiler
pan/bi: Add quirks for Mali G78
pan/bi: Only call clause code on Bifrost
pan/bi: Output binaries from standalone compiler
pan/bi: Add helpers for unit testing
pan/bi: Add instruction equality helper
pan/bi: Add instruction unit test macro
pan/bi: Remove redundant check in clamp fusing
pan/bi: Constify BIR manipulation
pan/bi: DCE after bifrost_nir_lower_algebraic_late
pan/bi: Add discard flag to bi_index
pan/bi: Remove unused BIR_FAU_HI
pan/bi: Model *ADD_IMM instructions in IR
pan/bi: Model RSCALE for Valhall
pan/bi: Model Valhall special values as FAU
pan/bi: Fix typo in FAU enum
pan/bi: Rename NOP.i32 to NOP
pan/bi: Rename CLPER_V7 back to CLPER
pan/bi: Add strip_index helper
pan/bi: Add helper to swizzle a constant
pan/bi: Use bi_apply_swizzle in constant folding
pan/bi: Refactor constant folding for testability
pan/bi: Add constant folding unit test
pan/bi: Fix UBO push with nir_opt_shrink_vectors
pan/bi: Garbage collect stuff in bi_layout.c
pan/bi: Add branch_offset immediate
pan/bi: Clean up and export bi_reconverge_branches
pan/bi: Clarify the logic of bi_reconverge_branches
pan/bi: Align staging registers on Valhall
pan/va: Allow floating-point swizzles on ATEST
gallium/tests: Fix warning calculating absdiff
pan/bi: Inline away bi_must_last
pan/bi: Remove dated ASSERTED properties
pan/bi: Expose unit tested scheduler predicates
pan/bi: Add BIT_ASSERT helper for unit testing
pan/bi: Teach meson about scheduler predicate test
pan/bi: Teach meson about Bifrost packing test
pan/bi: Teach meson about format pack tests
glsl/standalone: Lower COMPUTE shader precision
pan/bi: Restrict swizzles on same cycle temporaries
pan/bi: Test restrictions on same-cycle temporaries
pan/bi: Remove incorrect errata workaround
pan/bi: Use getopt for bifrost_compiler
pan/bi: Lower fragment output with <4 components
pan/bi: Add bi_entry_block helper
pan/bi: Handle asymmetric staging in bi_count_read_registers
pan/bi: Stub 64-bit in count_write_registers
pan/bi: Validate the live set starts empty
nir/lower_mediump_io: Don’t remap base unless needed
nir/lower_mediump: Fix metadata in all passes
pan/bi: Make bi_opt_push_ubo optional
pan/bi: Add a noopt debug option
panfrost: Add LINEAR debug option
panfrost: Remove unused #defines
panfrost: Use _PU for non-dithered formats
panfrost: Add blend helper packing the equation
panfrost: Fix is_opaque when blend_enable=false
panfrost: Simplify blend_factor_constant_mask
panfrost: Add basic fixed-function blending tests
panfrost: Leverage Bifrost’s 2*src blend factor
panfrost: Test src*dst + dst*src blending
pan/va: Document IEEE 754 conformance of clamps
pan/bi: Constant fold texturing lowerings
pan/bi: Unit test new constant folding patterns
pan/bi: Simplify bi_compose_clamp
pan/bi: Fuse abs/neg more on Valhall
pan/bi: Add shader equality helper for unit tests
pan/bi: Use FABSNEG pseudo ops for modifier prop
pan/bi: Add optimizer unit tests
pan/bi: Use FCLAMP pseudo op for clamp prop
pan/bi: Add fclamp unit tests
pan/bi: Fuse DISCARD with conditions
pan/bi: Unit test DISCARD+FCMP fusing
docs/panfrost: Update llvm option
drm-shim: Support kernels with >4k pages
panfrost: Fix leak of render node fd
panfrost: Rewrite the clear colour packing code
panvk: Use pan_pack_color
panfrost: Mark R5G6B5 as blendable
panfrost: Unit test clear colour packing
panfrost: Add dither state to the clear colour tests
panfrost: Handle non-dithered clear colours
panfrost: Add unit tests for non-dithered clears
panfrost: Disable shader-assisted indirect draws
pan/bi: Set eldest_colour dependency for ST_TILE
pan/bi: Don’t set td in blend shaders
pan/bi: Correct the sr_count on +ST_TILE
pan/bi: Extract load_sample_id to a helper
pan/bi: Set the sample ID for blend shader LD_TILE
panfrost: Evaluate blend shaders per-sample
pan/bi: Use ST_TILE for multisampled blend output
pan/bi: Use CLPER_V6 on Mali G31
panfrost: Remove unneeded quirks from T760
panfrost: Fix UNORM 10 sizes
panfrost: Use blendable check for tib read check
panfrost: Delete unpacks for blendable formats
pan/mdg: Insert moves before writeout when needed
pan/lower_framebuffer: Don’t replicate so much
pan/lower_framebuffer: Use fmul_imm
pan/lower_framebuffer: Unify UNORM handling
pan/lower_framebuffer: Don’t treat UNORM 4 special
pan/lower_framebuffer: Don’t open-code pad_vec4
pan/lower_framebuffer: Don’t open-code pan_unpacked_type_for_format
pan/mdg: Handle swapped 565 and 1010102 unorm
panfrost: Zero initialize blend_shaders
panfrost: Port v5 blend shader issue to blitter
panfrost: Fix NULL dereference in allowlist code
panfrost: Rip out primconvert code
panfrost/ci: Switch to suite support
panfrost/ci: Don’t skip matrix inverse tests
panfrost: Protect the variants array with a lock
panfrost: Remove null check in batch_cleanup
panfrost: Simplify get_fresh_batch_for_fbo
panfrost: Don’t use ralloc for resources
panfrost: Move bo->label assignment into the lock
panfrost: Remove get_fresh_batch
panfrost: Inline add_fbo_bos
panfrost: Switch resources from an array to a set
panfrost: Cache number of users of a resource
panfrost: Maintain a bitmap of active batches
panfrost: Add foreach_batch iterator
panfrost: Prefer batch->resources to rsrc->users
panfrost: Remove rsrc->track.users
panfrost: Remove writer = NULL assignments
panfrost: Replace writers pointer with hash table
panfrost: Take a ctx when submitting/destroying
panfrost: Raise maximum texture size
panfrost: Remove CACHE_LINE_SIZE #define
panfrost: Remove stale TODOs and XXXs
panfrost: Remove unused functions
pan/bi: Simplify condition
pan/bi: Assert l != NULL in bi_ra
pan/bi: Remove unused clause_start field
pan/bi: Fix format specifiers in disassembler
docs/panfrost: Remove obsolete note on Android.mk
docs/panfrost: We’re conformant now!
docs/panfrost: Add web chat link
panfrost: Fix incorrect test condition
panfrost: Add ASTC stretch factor enums
panfrost: Assert ASTC/AFBC are not used on v4
panfrost: Use ASTC 2D enums
panfrost: Encode 3D ASTC dimensions
panfrost: Move special_varying to compiler definitions
panfrost: Fix off-by-one in varying count assert
panfrost: Introduce PAN_MAX_VARYINGS define
panfrost: Don’t set CAP_TGSI_FS_COORD_PIXEL_CENTER_INTEGER
panfrost: Fix PAN_MESA_DEBUG=sync with INTEL_blackhole_render
nir: Add Mali-specific derivative opcodes
pan/bi: Optimize abs(derivative)
panfrost: Don’t allow rendering/texturing 48-bit
panfrost: Detect implementations support AFBC
panfrost,panvk: Use dev->has_afbc instead of quirks
panfrost: Fix gl_FragColor lowering
panfrost: Workaround ISSUE_TSIX_2033
panfrost: Add internal afbc_formats
panfrost: Decompress for incompatible AFBC formats
panfrost: Enable AFBC on v7
mesa: Require MRT support for GL3/ES3
nir/lower_pntc_ytransform: Support PointCoordIsSysval
Andreas Baierl (5):
lima: CI: Enable GL_R8 and GL_RG8 texture formats
lima: Expose GL_EXT_clip_control
lima: Remove depth near/far workaround
lima: Fix glFrontFace handling
lima/parser: add shader disassembly to dump
Andreas Bergmeier (1):
v3dv: implement VK_EXT_physical_device_drm
Antonio Caggiano (3):
ci/freedreno: Test with non-redistributable traces
freedreno/ci: Add a manual job for tracking performance
pps: Restore documentation
Anuj Phogat (1):
intel/dg2: Add L3 configuration
Arvind Yadav (1):
radeonsi: remove the use of PKT3_CONTEXT_REG_RMW
Axel Davy (1):
util: Fix translate from block compressed to rgba
Bas Nieuwenhuizen (72):
zink: set dedicated allocation when needed
util/fossilize_db: Update parsed_offset correctly.
util/fossilize_db: Reset file position to parsed_offset on cache_offset read failure.
util/fossilize_db: Flush files after header write.
util/fossilize_db: Be conservative about header length check for locking.
util/fossilize_db: Only allocate entries after full read.
util/fossilize_db: Use uint64_t for file size.
util/fossilize_db: Unlock the cache file if the entry already exists.
util/fossilize_db: Add extra flock mutex.
radv: Use correct signedness in misalign test.
radv: Allocate space for inline push constants.
nir/lower_scratch: Ensure we don’t lower vars with unsupported usage.
nir/inline_functions: Handle halting functions.
radv: Check format before calling depth_only/stencil_only.
util/fossilize_db: Don’t corrupt keys during entry read.
nir: Avoid visiting instructions multiple times in nir_instr_free_and_dce.
radv: Expose a bufferImageGranularity of 1.
radv: Fix CPU AABB build.
radv: Fix arrayOfPointers for instances in accel struct build.
radv: Add accel struct build support for the object-to-world matrix.
radv: Add more acceleration structure formats.
radv: Add optimized CPU BVH builds.
radv: Add bvh node definitions to a header.
radv: Modify load_sbt_amd intrinsic to get the descriptor.
aco: Implement call scope.
radv: Refactor some nir_channels usage to use nir_channel.
radv: Do more meta shader lowering.
radv: Implement NULL accel struct descriptor write.
nir: Add AMD rt intrinsics.
radv: Add support for ray launch size.
aco: Add support for ray launch size.
nir: Support ray launch size in divergence analysis.
radv: Support nir_intrinsic_load_global_constant.
radv: Add RT cache flushes.
radv: Add pipeline type.
radv: Add group info to pipeline.
radv: Add raytracing pipeline properties.
radv: Make some pipeline functions non-static.
radv: Add scaffolding for RT pipeline compilation incl libraries.
radv: Add main loop variables.
radv: Add helper to inline shaders into the main shader.
radv: Add helper to parse raytracing stages.
radv: Add pass to lower anyhit shader into an intersection shader.
radv: Add ray traversal loop.
radv: Combine all the parts together with a main loop for an RT pipeline.
radv: Add support for setting a dynamic stack size.
radv: Add caching for RT pipelines.
radv: Experimentally enable RT extensions.
radv: Add DMA buffer update function for internal use.
radv: Add an internal indirect dispatch command.
radv: Add an indirect dispatch struct to the header.
radv: Add copy/serialization info to accel struct headers.
radv: Add acceleration structure queries.
radv: Add GPU copy/serialization/deserialization shader.
radv: Add CPU copying of acceleration structures.
radv: Add GPU copying of acceleration structures.
radv: Add CPU serialization of acceleration structures.
radv: Add GPU serialization of acceleration structures.
radv: Fix Android build for common functions.
radv: Don’t invalidate VCACHE after clear_htile_mask.
radv: Add VK_FORMAT_R16G16B16A16_UNORM for accel. structures.
radv: Handle copying zero queries.
amd/common: Add fallback for misreported clocks for RGP.
radv: Document cache coherency rules.
radv: Add hooks after in-renderpass meta operations.
radv: Try to do a better job of dealing with L2 coherent images.
radv: Fix modifier property query.
radv: Add bufferDeviceAddressMultiDevice support.
radv: Disable coherent L2 optimization on cards with noncoherent L2.
meson: Check arguments before adding.
util: Add support for clang::fallthrough.
radv: Fix memory corruption loading RT pipeline cache entries.
Boris Brezillon (137):
panfrost: Fix pan_blitter_emit_bifrost_blend()
panfrost: Add explicit padding to pan_blend_shader_key
pan/gen_pack: Generalize the PREFIX() trick
panvk: Add missing midgard_pack dependency
pan/gen_pack: Add pan_size() and pan_align() macros
panfrost: Move the polygon list init logic to pan_cmdstream.c
pan/gen_macros: Move the TEXTURE definition to gen_macros.h
pan/gen_macros: Map {TEXTURE,SAMPLER} to the arch-specific descriptor
pan/gen_macros: Include midgard_pack.h from gen_macros.h
panfrost: Stop including midgard_pack.h directly
panfrost: s/[idep_]midgard_pack/[idep_]pan_packers/
panfrost: Get rid of the mali_xxx enum redefinitions
panfrost: Add generic mappings for the gen-specific tiler descriptor macros
pan/gen_pack: Add parens around packed1/2 vars in pan_merge()
panfrost: Get rid of all _packed structs in pan_context.h
panfrost: Move panfrost_modifier_to_layout() to pan_texture.c
panfrost: Only emit special attribute buffer entries on pre-v6 hardware
panvk: Prepare per-gen split
panfrost: Prepare indirect dispatch helpers to per-gen XML
panfrost: Prepare indirect draw helpers to per-gen XML
panfrost: Fix pan_blit_ctx_init() when start > end
panfrost: Make pan_blit() return the tiler job pointer
panfrost: v7 does not support RGB32_UNORM textures
panvk: Make the per-arch static lib depend on panvk_entrypoints.h
panvk: Fix panvk_copy_fb_desc()
panvk: Don’t use pan_is_bifrost()
panvk: Fix blend descriptor emission
panvk: Only advertise MSAA-4
panvk: We don’t support linear filtering on integer formats
panvk: Don’t advertise min/max filter
panvk: Fix chan_size calculation in panvk_emit_blend()
panvk: Narrow the allow-forward-pixel-kill condition
panvk: Clamp blend constants before copying them to the cmdbuf state
panvk: Don’t allocate an array of blend constants
panvk: Close the panfrost device in the panvk_physical_device_init() error path
panvk: Reset panvk_pool->transient_bo in panvk_pool_reset()
panvk: Fix a BO leak in panvk_pool_alloc_backing()
panvk: Initialize clear values to zero when load_op != OP_CLEAR
panvk: Don’t take a BO reference when binding memory to an image
panvk: Only set PAN_DBG_TRACE if PANVK_DEBUG_TRACE is set
panvk: Disable the BO cache
panfrost: Patch Z32_S8X24 format when creating a sampler view
panfrost: Fix the Z32_S8X24 and X32_S8X24 definitions
panfrost: RGB10_A2_SNORM is not a valid texture format on v6+
panfrost: Drop the R and T flags on SCALED formats
panfrost: RGB332_UNORM is not a valid texture format on v6+
panfrost: Prepare blitter helpers to per-gen XML
panfrost: Prepare blend helpers to per-gen XML
panfrost: Prepare pan_cs helpers to per-gen XML
panfrost: Move panfrost_major_version() to gen_macros.h
panfrost: Prepare pandecode to per-gen XML
panfrost: Prepare scoreboard helpers to per-gen XML
panfrost: Prepare pan_encoder.h to per-gen XML
panfrost: Prepare texture helpers to per-gen XML
panfrost: Prepare shader helpers to per-gen XML
panfrost: Fix indirect draws when vertex or instance count is 0
panfrost: Fix collision in the indirect draw shader table
panfrost/ci: Skip the indirect_draw+XFB tests
pan/bi: Relax check on 8bit swizzles
pan/bi: Allow passing RT conversion descriptors to fragment shaders
pan/blit: Fix a NULL dereference in the preload path
pan/blit: Extend pan_preload_fb() to return emitted jobs
panvk: Initialize the blend shader logic
panvk: Preload FB attachments when required
panvk: Merge identical BO entries before submitting a job
panvk: Move copy stubs to a separate file
panvk: Move blit/resolve stubs to a separate file
panvk: Get rid of panvk_emit_fragment_job()
panvk: Don’t use the subpass to calculate the FB descriptor size
panvk: Don’t check the bind_point in panvk_cmd_prepare_fragment_job()
panvk: Make panvk_cmd_alloc_tls_desc() more generic
panvk: Add a panvk_cmd_prepare_tiler_context() helper
panvk: Stop dereferencing the subpass in panvk_cmd_close_batch()
panvk: Issue a fragment job if at least one target is cleared
panvk: Implement vkCmdClear{DepthStencil,Color}Image()
panvk: Implement vkCmdCopyImage()
panvk: Implement vkCmdCopyBufferToImage()
panvk: Implement vkCmdCopyImageToBuffer()
panvk: Implement vkCmdCopyBuffer()
panvk: Implement vkCmdFillBuffer()
panvk: Implement vkCmdUpdateBuffer()
pan/decode: Fix DCD size in Pre frame decoding
pan/blit: Let the caller offset the start/end coords passed to the blitter
pan/blit: Fix 3D blittering
panvk: Implement vkCmdBlitImage()
panvk: Always allocate at least one BLEND descriptor for fragment shaders
panvk: Fix the static scissor/viewport case
panvk: Fix TLS initialization for multi-draw batches
panvk: Extend panvk_cmd_close_batch() to handle current_batch == NULL
panvk: Make panvk_cmd_open_batch() return the new batch
panvk: Use the local batch variable when we have one
panvk: Don’t invalidate the vertex attributes when binding a new pipeline
panvk: Fix the pipeline binding logic
panvk: Fix panvk_pipeline_builder_upload_sysval()
panvk: Fix multisample image copies
panvk: Avoid allocating sysvals UBOs when the pipeline has one
panvk: Handle input varyings without previous writes
panvk: Fix an overflow on cmdbuf->state.clear
panvk: Don’t expect subpasses to use all RTs
panvk: Only prepare texture descriptors when the image is sampled
panvk: Fix 1DArray image to buffer copy
panvk: Fix size overflow in GetBufferMemoryRequirements()
panvk: Fix stencil clear assignment in panvk_cmd_fb_info_set_subpass()
panvk: Handle VK_REMAINING_{MIP_LEVELS,ARRAY_LAYERS) when creating image views
panvk: Split var copies before lowering them
panvk/ci: Trigger bifrost jobs on vulkan changes
pan/bi: Fix 1DArray image coordinate retrieval
pan/lower_fb: Support SNORM8 unpacking
pan/lower_fb: Re-order components when dealing with raw formats
pan/lower_fb: Add support for B10G10R10A2_UINT variants
pan/lower_fb: Add support for rgb10a2 _SINT variants
panfrost: Use an identity swizzle for RAW formats
panfrost: Add a common genxml file so we can share a few definitions
panfrost: Split command stream descriptor definitions per-gen
panfrost: Move genxml related files to a subdir
nir: Make sure src->num_components < dst->num_components in nir_ssa_for_src()
nir/lower_blend: Pad src to a 4-component vector
nir/lower_blend: Don’t lower RTs whose format is set to NONE
nir/lower_blend: Make sure we’re not passed scaled formats
nir/lower_blend: Shrink blended result if needed
pan/blend: Allow passing blend constants through a sysval
panvk: Fill the blend constants sysval
panvk: Lower blend operations when needed
panvk/ci: Enable blend tests
panvk: Fix allocation of BOs bigger than the slab size
panvk: Don’t use panfrost_get_default_swizzle() on v7+
panvk: Fix wls_size retrieval
panvk: Pass the render target index to panvk_meta_clear_attachment()
panvk: Allow clear_attachment of RTs > 0
panvk: Support clearing ZS attachments
nir: Add a nir_sysvals_to_varyings() helper
spirv: Let spirv_to_nir() users turn sysvals into input varyings
spirv: Always declare FragCoord as a sysval
spirv: Declare PointCoord as a sysval
vulkan: Fix weak symbol emulation when compiling with MSVC
vulkan: Set unused entrypoints to vk_entrypoint_stub when compiling with MSVC
vulkan: Fix entrypoint generation when compiling for x86 with MSVC
Boyuan Zhang (5):
radeon/vcn: initialize num_temporal_layers for hevc
radeon/vcn: track width and height of the last frame
radeon/vcn: check frame size change for vp9 header flags
radeon/vcn: set min value for num_temporal_layers
frontends/va: add num_temporal_layers check
Caio Marcelo de Oliveira Filho (27):
vulkan/util: Add and use vk_multialloc_zalloc variants
anv: Zero initialize pipeline structs
spirv: Implement SPV_EXT_shader_atomic_float16_add
vulkan: Update XML and headers to 1.2.185
anv: Advertise support for VK_EXT_shader_atomic_float2
nir/dead_cf: Do not remove loops with loads that can’t be reordered
nir: Update documentation for location to mention Task/Mesh
nir: Add a way to identify per-primitive variables
nir: Add per-primitive I/O intrinsics
compiler: Add new non-Multiview Task/Mesh builtins
compiler: Add Task/Mesh to shader_info
nir/lower_io: Identify Mesh output as arrayed
nir/divergence_analysis: Handle Task/Mesh shaders
nir: Don’t lower Task/Mesh I/O to temporaries
nir: Allow Task/Mesh to lower compute system values
spirv: Implement non-Multiview parts of SPV_NV_mesh_shader
anv: Simplify subgroup_size_type rules for compute shaders
anv: Refactor subgroup_size_type rules into a single function
spirv: Identify non-temporal memory access
nir/lower_io_to_vector: Allow Task/Mesh to load from outputs
intel: Add and use max_constant_urb_size_kb
iris: Document push constants allocation
anv: Validate vertex related states only when VS is present
anv: Move together primitive pipeline emit calls
anv: Identify code paths specific to graphics primitive pipeline
intel/compiler: Convert test_eu_compact to use gtest
intel/compiler: Remove unused `ret` declaration
Caio Oliveira (1):
util/ra: Fix deserialization of register sets
Carsten Haitzler (1):
panfrost: tidy up GPU naming to be in line with official names
Charlie Turner (5):
ci: Build libdrm earlier for x86_test-vk
ci: Fix syntax error in radv fails files
ci: Support per-driver skip lists.
radv/ci: Remove duplication in dEQP skip lists.
radv/ci: Fix the GPU_VERSION for polaris10
Charmaine Lee (2):
aux/draw: Check for preferred IR to take nir-to-tgsi path in draw module
svga: fix render target views leak
Chia-I Wu (43):
venus: refactor vn_EndCommandBuffer
egl/surfaceless: try kms_swrast before swrast
meson: allow egl_native_platform to be specified
vulkan/wsi: replace prime_blit_buffer by a bool
venus: clean up vn_AllocateMemory
venus: suballocate memory in more cases
venus: log more WSI messages
vulkan/wsi/x11: do not inherit last_present_mode
venus: print warnings when stuck in busy waits
iris, crocus: add idep_genxml to per_hw target dependencies
venus: update venus-protocol headers
venus: break up vn_device.h
venus: break up vn_device.c
venus: free queues after vkDestroyDevice is emitted
venus: use uint32_t in vn_ring_submit
venus: minor cleanup to physical device init loop
venus: pre-initialize device groups
venus: fix device group enumeration with unsupported devices
venus: group physical device fields with a struct
venus: no supported device is not an error
venus: initialize physical devices once
venus: reorder version fields in vn_instance
venus: init roundtrip fields in vn_instance later
venus: add vn_renderer_submit_simple_sync
venus: support reply shmem without ring
venus: init experimental features before the ring
venus: add and use VN_CS_ENCODER_INITIALIZER
venus: rework vn_instance_submission
venus: make ring buffer size configurable
venus: update venus-protocol headers
venus: raise the ring buffer size to 64KB
venus: refactor vn_instance_enumerate_physical_devices
venus: separate physical device init and filter
venus: copy VkPhysicalDeviceImageDrmFormatModifierInfoEXT
venus: add vn_refcount
venus: convert bo and shmem to use vn_refcount
venus: add a helper to destroy vn_descriptor_set
venus: add vn_refcount to vn_descriptor_set_layout
venus: keep layouts of descriptor sets alive
radv: plug leaks in radv_device_init_accel_struct_build_state
vulkan/wsi/wayland: fix an invalid u_vector_init call
util/vector: make util_vector_init harder to misuse
venus: add atrace support
Christian Gmeiner (46):
etnaviv: export supported prim types
etnaviv: remove primconvert
ci: include etnaviv support in ARMHF container.
ci: update kernel
ci/bare-metal: add telnet based serial
ci/bare-metal: add support for eth008 power relay
ci/bare-metal: add etnaviv
lima: fix leak of the screen hash table
util/tests: rename bitset test names
util/bitset: add bitwise AND, OR and NOT
util/tests: add bitwise AND, OR and NOT tests
util/bitset: add right shift
util/tests: add bitset SHR tests
util/bitset: add left shift
util/tests: add bitset SHL tests
util/bitset: s/BITSET_SET_RANGE/BITSET_SET_RANGE_INSIDE_WORD
util/bitset: add BITSET_SET_RANGE(..)
util/tests: add set bit range test
freedreno/isa: add leading zeros
freedreno/isa: simplify custom_target
freedreno/isa: add next_instruction(..)
freedreno/isa: add defines for fprintf(..) usage
freedreno/isa: store max size for needed bitset
freedreno/isa: generate ir3-isa.h
freedreno/isa: generate isaspec-decode.h
freedreno/isa: add bitmask_t to encode.py
freedreno/isa: add bitmask to/from uint64_t helper
freedreno/isa: add BITMASK_WORDS define
freedreno/isa: add store_instruction(..)
freedreno/isa: generate macros used for printf(..)
freedreno/isa: add split_bits(..) methods
freedreno/isa: decode: switch bitmask_t to BITSET_WORD’s
freedreno/isa: encode: switch bitmask_t to BITSET_WORD’s
freedreno/isa: update documentation
freedreno/isa: add shebang and make executable
freedreno/isa: move isaspec to a new home
compiler/isaspec: add print(..) helper
compiler/isaspec: keep track of written data
compiler/isaspec: add alignment support
etnaviv: use better name for fd hash table
etnaviv: fix leak of the screen hash table
etnaviv: fix indentation
etnaviv: move drm version readout to drm layer
etnaviv: allow screen creation with NULL renderonly object
etnaviv: extend screen_create(..) with gpu_fd
etnaviv: add etna_lookup_or_create_screen(..)
Clayton Craft (1):
anv: don’t advertise vk conformance on GPUs that aren’t conformant
Connor Abbott (81):
tu: Triage some CTS failures
ir3: Preserve gl_ViewportIndex in the binning shader
tu: Use NIR for clear/blit shaders
ir3: Delete old packed struct encoding
tu: Handle multisample vkCmdCopyColorImage()
tu: Make tile stores use a dedicated CS
tu: Implement non-aligned multisample GMEM STORE_OP_STORE
freedreno: Rename and document tess primid-related sysvals
tu, freedreno/a6xx, ir3: Rewrite tess PrimID handling
tu, freedreno/a6xx: Fix setting PC_XS_OUT_CNTL::PRIMITVE_ID
ir3: Document RA-related register flags better
tu: Read some input attachments directly
freedreno/a6xx: Add new register fields
freedreno, tu: Stop asking for foveation quality
freedreno, tu: Set GRAS_LRZ_PS_INPUT_CNTL::SAMPLEID
freedreno/a6xx: Document GRAS_SC_CNTL::SINGLE_PRIM_MODE
tu: Fix feedback loops in sysmem mode
tu: Fix xfb when there is a hole at the end
freedreno: Decode a650+ CP_START_BIN/CP_END_BIN packets
tu: Fix logic errors with subpass implicit dependencies
tu: Consider depth/stencil for implicit dependencies
ir3: Add pass to remove unreachable blocks
ir3/ra: Remove logical_unreachable
ir3: Copy-propagate single-source phis
ir3: Print physical successors/predecessors
ir3/print: Use mesa_stream_log_printf for (kill)
ir3/merge_regs: Set wrmask for pcopy destinations
ir3/ra: Reinitialize interval when inserting
ir3/ra: Fix available bitset for live-through collect srcs
ir3/ra: Handle huge merge sets
ir3/ra: Make ir3_reg_interval_remove_all() useful for spilling
ir3: Add loop depth to ir3_block
ir3: Add ra_foreach_src_n/ra_foreach_dst_n
ir3: Fix RA debug printing
ir3: Properly validate pcopy reg sizes
ir3: Fix compress_regs_left accounting for half-regs
ir3: Initial support for spilling non-shared registers
ir3: Fix getting stp/ldp components in ir3_info
ir3, turnip, freedreno: Report stp/ldp in shader stats
freedreno/ci: Add spillall tests
tu: Properly handle waiting on an earlier pipeline stage
tu: Add a650-specific CCU flush workaround
tu: Remove some stale bypass xfails
ir3: Remove ir3_instr::name
ir3: Make instruction IP 32 bits
ir3: Make ir3_register::name 32-bits
ir3/ra: Fix type mismatch when comparing intervals
lima: Add a NIR load duplicating pass
lima/gpir: Rewrite register allocation for value registers
freedreno/computerator: Add support for pvtmem
ir3/lower_pcopy: Use right flags for src const/immed
ir3/lower_pcopy: Set entry->done in the swap loop
tu: Fix VS primid with tess + GS
freedreno/a6xx: Fix VS primid with tess + GS.
ir3: Add bar to beginning of HS with tess_use_shared
freedreno, turnip: Disable 8bpp UBWC on a650
ir3: Make trig replacement expression exact
freedreno/a6xx: Name TPL1_DBG_ECO_CNTL
freedreno, turnip: Set TPL1_DBG_ECO_CNTL better
ir3: Use source in ir3_output_conv_src_type()
tu/clear_blit: Constify some image views
tu: Implement VK_KHR_imageless_framebuffer
ir3/lower_subgroups: Support 16-bit READ_* sources
ir3: Skip src size validation for cat1
tu: Expose VK_KHR_shader_subgroup_extended_types
ir3: Initialize local size earlier
ir3/ra: Don’t reset round-robin start for each block
ir3/ra: Use killed sources in register eviction
ir3/cp: Add missing const promotion check
ir3/cp: Fix inlining 32->16 const into meta instructions
nir/lower_ubo_vec4: Fix align_mul=8 special case
ir3: Fix printing branch type
ir3: Make ir3_create_collect() take a block
ir3: Always create barycentrics in the input block
ir3: Remove separate regmask.h
ir3: Handle special regs in regmask
ir3/legalize: handle WAR for special regs
ir3: Fix check for immediate range
ir3: Fix handling cat6 immediates
ir3: Fold ldc src immediates
ir3/spill: Mark root as non-spillable after inserting
Corentin Noël (8):
ci: actually run piglit tests with virgl
ci: Re-enable piglit trace for virgl
ci: Disable llvmpipe optimizations when running virgl CI
ci: Increase the default Rust toolchain version
ci: Increase crosvm version
ci: Use crosvm to run dEQP tests for virgl
glx: Prevent crashes when an extension isn’t found
virgl: Set GL_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION to 1
Daniel Schürmann (54):
aco/optimizer: ensure to not erase high bits when propagating packed constants
aco/ra: don’t allocate vector space for MIMG NSA operands
aco: include <cstddef> in aco_util.h
nir/lower_alu_to_scalar: don’t skip gaps in write_mask
nir/opt_shrink_vectors: don’t shrink vectors used by intrinsics
nir: consider write_mask in nir_ssa_def_components_read()
nir/opt_shrink_vectors: reverse iteration order
nir/shrink_vectors: shrink ALU properly
nir/shrink_vectors: shrink vecN properly
nir: return false for loops in contains_other_jump()
aco/print_ir: fix printing of VOPC_SDWA definitions
aco: use VOPC_SDWA on GFX9+
aco: add instr_is_16bit() helper function
aco/ra: refactor subdword definition info
aco/ra: refactor subdword operand stride
aco/validate: simplify get_subdword_bytes_written()
aco/opcodes: remove definition_size[]
aco: add more validation rules for SDWA operands
nir/loop_analyze: consider instruction cost of nir_op_flrp
nir/opt_algebraic: optimize flrp(fadd, fadd, x) only if fadd are used_once
radv: call nir_lower_flrp() after the first radv_optimize_nir()
aco: remove redundant s_and exec after nir_op_inot
aco: only apply extract if not used more than 4 times
aco: refactor nir_op_imul selection
aco/optimizer: combine v_mul_lo_u16 + v_add_u16 -> v_mad_u16
aco/optimizer: fuse v_mul_f64 + v_add_f64 -> v_fma_f64
aco/optimizer: combine v_pk_mul_u16 + v_pk_add_u16 -> v_pk_mad_u16
aco: fix init_any_pred_defined() for loop header phis
aco: refactor lower_phis()
aco/lower_bool_phis: avoid creating trivial phis
aco/lower_phis: propagate constants before emitting merge code
aco/lower_phis: optimize loop exit phis
aco: fix p_insert lowering with 16bit sources
aco: rewrite SDWA selector
aco: remove explicit dst_preserve flag
aco/print_ir: always print SDWA dst & src selections
aco: preserve subdword RC when lowering p_insert/p_extract
aco/ra: Fix potential out-of-bounds array accesses.
aco/ra: don’t copy linear VGPRs within CF in get_reg_create_vector()
aco: stop scheduling if clause-forming fails
aco: make clause-forming depend on the number of moved instructions
aco: try forming clauses even if reg_pressure exceeds
aco: clang-format
aco/ra: fix intersects()
aco/ra: refactor affinities into assignment struct
aco/ra: remove some redundant code
aco/ra: split register assignment for phis into separate function
aco/ra: try more aggressive to assign phi defs the same register
aco/ra: for phis try to find an operand-matching register earlier
aco/ra: don’t set affinities for ssa-repair phis
aco/ra: create affinities between nested phis
aco/ra: create nested affinities for loop header phis
aco/ra: don’t rewrite affinities for phi operands after register assignment
driconf: set vk_x11_strict_image_count for Wolfenstein: Youngblood
Daniel Stone (7):
vulkan/wsi/wayland: Cosmetic alignment fix
vulkan/wsi/wayland: Initialise wl_shm pointer in VkImage
egl/wayland: Error on invalid native window
egl/wayland: Allow EGLSurface to outlive wl_egl_window
CI: Disable LAVA devices
Revert “CI: Disable LAVA devices”
fdno/resource: Rewrite layout selection for allocation
Danylo Piliaiev (39):
freedreno: fix wrong tile alignment for 3 CCU gpu
tu: handle half-reg fs outputs
tu: delay decision of forcing sysmem due to subpass self-dependencies
turnip: reduce maxComputeWorkGroupSize
tu: disable gmem in primary cmdbuffer if secondary has it disabled
tu: add “flushall” and “syncdraw” debug options
freedreno/decode: print estimated crash location without colored output
tu: declare VK_EXT_extended_dynamic_state2 but leave it disabled
tu: implement dynamic depth bias enable
tu: implement dynamic primitive restart enable
tu: implement dynamic rasterizer discard enable
tu: enable VK_EXT_extended_dynamic_state2
turnip: provide dummy CmdSetLogicOpEXT and CmdSetPatchControlPointsEXT
freedreno: rename Z_TEST_ENABLE->Z_READ_ENABLE, Z_ENABLE->Z_TEST_ENABLE
turnip: apply workaround for depth bounds test without depth test
ir3: prohibit folding of half->full conversion into mul.s24/u24
ir3/a6xx,freedreno: account for resinfo return size dependency on IBO_0_FMT
turnip: consider shader’s immediates size for sub-stream allocation
turnip: re-emit vertex params after they are invalidated
util/u_trace: make u_trace usable for other than gallium drivers
util/u_trace: auto-generation of serialization funcs for tracepoints
turnip: implement basic perfetto support
u_trace: helpers for tracing tiling GPUs and re-usable VK cmdbuffers
turnip/perfetto: reusable command buffers support
u_trace: pass command stream through tracing functions
turnip: support tracing of gmem/sysmem load/store/clears
turnip/kgsl: fix compilation after perfetto introduction
turnip: consider multiview_mask when clearing depth-stencil attachment
turnip: Move to common DEFINE_HANDLE_CASTS casting macro
turnip: clamp per-tile scissors to max viewport size in binning pass
turnip: fix vbs emission when there are holes in bindings
ir3: remove obsolete assert for intrinsic_store_output in tess
turnip: do nothing on dispatch with zero total workgroups
ir3: support source modes for resinfo.b
ir3/freedreno: handle non-uniform resinfo
ir3/freedreno: handle non-uniform a1en instructions
turnip: fix streamout buffer offset calculations
ir3/ra: Check register file upper bound when updating preferred_reg
tu: fix rast state allocation size on a6xx gen4
Dave Airlie (134):
lvp: fixup multi draw memcpys
lavapipe: fix multi-draw regression in shader parameters test
lavapipe: fix indexed multi draw draw_id increment
draw: handle resetting draw_id between instances.
softpipe/aniso: move DDQ calculation to after scaling.
wl/shm: don’t fetch formats if not requested.
clover/il: return IL only for spirv and correct length
gallivm: add anisotropic filter weight table.
draw: add shader access to aniso filter table.
llvmpipe: add filter table shader accessor
gallivm: add support for anisotropic sampling.
llvmpipe: add support for max aniso query.
draw: add sampler max_aniso query.
llvmpipe: enable GL_ARB_texture_filter_anisotropic
llvmpipe/virgl/ci: update traces for aniso
docs: update anisotropic info for softpipe/llvmpipe/lavapipe
crocus/gen4-5: fix ff gs emit on VS vue map change.
llvmpipe/linear: fix ppc64/s390 build
llvmpipe: add some extra linear rast checks.
llvmpipe: add support for time elapsed queries.
llvmpipe: rework query fence signalling for get_query_result_resource
gallivm/img: use uint for image coord builder.
draw/llvmpipe: multiply polygon offset units by 2
teximage: return correct desktop GL error for compressedteximage
crocus/gen4: restrict memcpy mapping to gen5
intel/fs: restrict max push length on older GPUs to a smaller amount
intel/decode: add gfx4 constant buffer decode
intel/decode: add gfx4 vertex shader decode
crocus/gen45: fix mapping compressed textures
intel/genxml: fix raster operation field in blt genxml
crocus: add support for set alpha to one with blt.
virgl: disable anisotropic filtering.
virgl: add support for anisotropic texture filtering
ci: bump to latest virglrenderer for anisotropic support
clover/llvm: turn off optional CL 3 features.
nir/libclc: handle null callee name when lowering
vtn: add support for atomic flag test/set/clear
nir: add 32-bit bool of fisfinite
nir: add fisnormal lowering
gallivm: handle fisfinite/fisnormal
clover: fix api zero sized enqueue
clover: return CL_INVALID_PLATFORM properly.
clover: add kernel attributes support for SPIR-V
clover: fix compilation with clang + llvm 12.
clover/nir: don’t convert to NIR on library link
clover: only return CLC version as 1.2 (even for 3.0)
llvmpipe: add support for user memory pointers
lavapipe: add host ptr support.
docs: add llvmpipe host memory extensions
crocus/blt: add pitch/offset checks to fix blt corruption
crocus: align staging resource pitch on gen4/5 to allow BLT usage.
intel/vec4: sel.cond writes the flags on Gfx4 and Gfx5
draw: handle primitive ID for quads/quad strips.
draw/gs: add clipvertex support for compatibility
draw/tess: add clipvertex support for compatibility
draw: add vertex color clamping to gs/tes
llvmpipe: enable GL compatibility profiles
gallivm: don’t lower local invocation index in frontend
llvmpipe/cl: limit kernel input size.
gallivm: fix idiv/irem for 8/16/64-bit and 32-bit INT_MIN/-1
gallivm: fix non-32 bit popcounts.
llvmpipe: init renderer string once to avoid races.
vulkan/wsi/sw: wait for image fence before submitting to queue
crocus: copy views before adjusting
crocus: drop u_primconvert header.
crocus: add missing line smooth bits.
crocus: add missing fs dirty on reduced prim change.
vulkan/wsi: add support for detecting mit-shm pixmaps.
vulkan/wsi/sw: add support for using host_ptr for shm pixmaps.
vulkan/wsi/sw: add mit-shm support for pixmap allocation
meson: fix regression finding shm dep
llvmpipe/fs: fix multisample depth/stencil fs writes.
llvmpipe: consolidate scissor plane code between line/tri
llvmpipe/scissor: rewrite scissor planes interaction.
llvmpipe: adjust scissor planes for multisample.
gallium: add a sample0 only option to blitter.
u_blitter: add support for sample0 only resolves.
lavapipe: VK_KHR_depth_stencil_resolve support
crocus/gen7: add missing IVB/GT2 geom shader workaround.
intel/decode/gfx6: add support for gfx6 CC/VIEWPORT pointers.
gallivm/ssbo: fix up dynamic indexed ssbo load/stores/atomics
gallivm/ssbo: cast ssbo index to int type.
lavapipe: enable dynamic index ubo/ssbo
llvmpipe/cs: rework thread pool to avoid mtx locking
gallivm/coro: use a phi instead of alloca
llvmpipe: shorten hold time on the screen mutex
llvmpipe/cs: rework coroutine context handling (v2)
gallivm: add initial support for 16-bit float builder.
gallivm/nir: handle conversion to 16-bit texel fetch
gallivm/nir: fix f2b32
gallivm/nir: handle non-32bit mask scatter stores
gallivm: add 16-bit sin/cos via llvm intrinsic
llvmpipe: lower_flrp16
gallivm/nir: handle 16-bit exp/lod using intrinsics.
gallivm/nir: call pow with correct flt builder
gallivm/nir: pass the correct float builder to ddx/y
gallivm: increase tgsi nesting call stack size
gallivm: use llvm intrinsics for 16-bit round/trunc/roundeven
llvmpipe: enable FP16 and update CL + traces piglit results.
lavapipe: enable KHR_shader_float16_int8
gallivm/nir: handle subgroup reduction across all types
lavapipe: enable KHR_shader_subgroup_extended_types
docs: update docs for new llvmpipe/lavapipe features
lavapipe: enable KHR_spirv_1_4
lavapipe: fix vertex attributes/descriptor binding
lavapipe: don’t access pColorBlendState when not legal
gallium/format: move two vertex formats into the proper place.
lavapipe/ci: drop some fails I fixed recently
lavapipe: move to 1.2 features/properties structs.
gallivm/nir: fix subgroup invocation read.
lavapipe: enable vulkan 1.2 support.
lavapipe: move to new shared features/properties
lavapipe: cleanup image create function.
lavapipe: fixup image binding flags.
llvmpipe: overhaul fs/cs variant keys to be simpler.
gallivm: use pmulhrsw to make aos sampling more accurate.
crocus/gen6: don’t reemit the svbi when debugging
crocus/query: don’t loop on ready status after gpu hang.
gallivm/format: clamp SINT conversion rather than truncate.
llvmpipe/cs: change submission pattern for threadpool
llvmpipe: fix 4-bit output scaling.
lvp/fence: quick fix to previous commit.
device_select: close dri3 fd after using it.
wsi/x11: cleanup properly after mit shm paths are used.
Revert “lvp/fence: quick fix to previous commit.”
lavapipe: fix fence handling around wsi submission
crocus: Honor scanout requirement from DRI
crocus/gen5: reemit shaders on gen5 after new program cache bo.
crocus/gen5: add dirty flags for urb fences.
llvmpipe: fix userptr for texture resources.
lavapipe: drop EXT_acquire_xlib_display
vulkan/wsi: set correct bits for host allocations/exports for images.
llvmpipe: disable 64-bit integer textures.
llvmpipe: fix compressed image sizes.
Derek Foreman (2):
egl/wayland: Support RGBA ordered formats
egl/wayland: Properly clear stale buffers on resize
Dmitry Baryshkov (1):
freedreno/regs: add bit to control continuous clock with 7nm PHYs
Dylan Baker (19):
VERSION: bump version for 21.3 development cycle
docs/relnotes/new_features: empty for next release cycle
docs: update calendar for 21.2.0-rc1
docs: mark mesa 21.0 as done
freedreno/ir3: Add build id to the disassembler test
docs: add release notes for 21.2.0
docs: update calendar for 21.2.0-rc2
docs: update calendar for 21.2.0-rc3
docs: update calendar and link releases notes for 21.2.0
docs: Add calendar entries for 21.2 release.
bin/gen_release_notes: Add basic tests for parsing issues
bin/gen_release_notes: Don’t consider issues for other projects
bin/gen_release_notes: Fix commits with multiple Closes:
docs: add release notes for 21.2.2
docs/relnotes/21.2.2: Add SHA256 sum
docs: update calendar and link releases notes for 21.2.2
docs: add release notes for 21.2.3
docs: Add SHA256 sum for mesa 21.2.3
docs: update calendar and link releases notes for 21.2.3
Ed Baker (1):
frontends/va: Fix test_va_api VAAPIDisplayAttribs tests
Ed Martin (1):
winsys/radeonsi: Set vce_encode = true when VCE found
Eduardo Lima Mitev (1):
turnip: Add support for VK_VALVE_mutable_descriptor_type
Ella-0 (13):
v3dv: Add is_unorm, is_snorm and is_float format functions
v3dv: Implement VK_EXT_custom_border_color
v3dv: implement VK_EXT_color_write_enable
v3dv: Implement VK_EXT_pipeline_creation_cache_control
v3dv: Implement VK_EXT_provoking_vertex
v3dv: Implement VK_EXT_pipeline_creation_feedback
v3d/compiler: Handle point_coord_upper_left
v3d: Don’t handle PIPE_SPRITE_COORD_UPPER_LEFT twice
v3dv: Expose correct point size granularity
v3dv: Implement VK_EXT_vertex_attribute_divisor
ci/v3dv: Update fails with multiview failing with points
v3d: add R10G10B10X2_UNORM to format table
v3dv: enable VK_KHR_surface_protected_capabilities
Emma Anholt (233):
nir: Validate after deserialization.
nir_to_tgsi: Fix image declarations.
gallium/ttn: Add a debug flag for dumping the shaders.
freedreno/ir3: Reduce choose_instr_dec() and _inc() overhead.
gallium/ureg: Sort the output decls.
freedreno: Lock access to msm_pipe for RB object suballocation.
ci/freedreno: Enable the MSAA deqp tests.
gallivm: Default brilinear filtering to off.
gallivm: Always take the per-pixel LOD path for cubemaps.
i915g: Add support for shader-db.
nir_to_tgsi: Pack our tex coords into vec4 nir_tex_src_backend[12].
nir_to_tgsi: Add support for TXP.
nir_to_tgsi: Add support for HW atomics.
nir_to_tgsi: Declare buffers for all of num_ssbos.
nir_to_tgsi: Add support for nir_intrinsic_load_sample_pos.
turnip: Fix assertions on checking mutable combined samplers support.
gallium/dri2: Make dri_init_options just init DRI options.
gallium/driconf: Allow the driver to parse the driconf options.
ci: Stop disabling filter hacks for llvmpipe.
ci/i915: Update deqp expectations for another test passing.
ci: Uprev deqp-runner and use “suite” support to merge softpipe runs.
ci/llvmpipe: Use the deqp-runner suite support to consolidate jobs.
ci/i915g: Merge the two dEQP runs together.
ci: Save dEQP results on all tests.
ci/virgl: Use deqp-runner suite support to reduce CI job count.
ci/zink: Use deqp-runner suite support to reduce the CI job count.
ci: Update piglit to 4545a28cd8fea03fbab0e5f90bfbd812c32f3be1
ci/freedreno: Clear out TF API errors xfails.
freedreno/a5xx: Disable TF when pausing or transitioning to non-TF.
freedreno/a5xx: Don’t try to emit FS images in binning command streams.
ci/freedreno: Mark border_color as passing on a5xx.
ci/a5xx: Skip some piglit stress tests that destabilize CI.
ci/freedreno: Organize, fill out, and document our VK xfails.
ci/freedreno: Generalize the spirv_ids_abuse skips.
ci/freedreno: Clean up and fill out the tess timeout annotations.
ci/freedreno: Skip the slow dEQP-VK.ubo.random.all_shared_buffer.48 in CI.
ci/freedreno: Add jobs to manually do a full VK on freedreno.
i915g: Use the devmaster quadratic approximation for sin/cos.
i915g: Reapply clang-format.
nir: Move phi src setup to a helper.
i915g: Make the 1D workaround keep TXP’s .w channel in the right spot.
i915g: Add support for blitting compressed textures.
i915g: Add missing support for sRGB S3TC.
i915g: Fix up the format mapping for DXT1_*RGB
i915g: Add support for FXT1.
i915g: Fix 3D texture layouts for width != height.
i915g: Implement cube/3d texture_subdata() as a series of per-layer maps.
ci/turnip: Add a new flake from running more of the CTS.
ci/freedreno: Move freedreno’s deqp testing to suite support.
freedreno/a6xx: Apply the cube image size lowering to GL, too.
freedreno/ir3: Only lower cube image sizes once.
freedreno/ir3: Use the resinfo path for ssbo sizes on GL, too.
freedreno/ir3: Move a6xx’s get_ssbo_size shl to NIR.
freedreno/a6xx: Skip setting up image dims constants.
freedreno/a5xx: Use ST4_ constants for SSBO/image state types.
freedreno/a5xx: Reduce packet emits for SSBO state.
ci/freedreno: Mark a new flaky SSBO length test.
ci/freedreno: Flake the rest of the pbuffer/window dEQP-EGL tests.
i915g: Fix polygon offset by telling draw the Z format.
i915g: Correct PIPE_SHADER_CAP_MAX_TEMPS.
i915g: Reduce ARB_fp max tex indirections to match i915c.
i915g: Clear some xfails that are now skips.
i915g: Add comments explaining various xfails.
i915g: clang-format fixup.
freedreno/ir3: Apply the a6xx samgq workaround to TES/TCS/GS as well.
freedreno/ir3: Align driver param upload size/offset for indirect uploads.
freedreno/a6xx: Sync TFB BO access against prior TFB writes.
ci/lavapipe: Add a fractional run with ASan
ci/llvmpipe: Add a fractional ASan run.
nir: Set .driver_location for GLSL UBO/SSBOs when we lower to block indices.
nir/nir_lower_uniforms_to_ubo: Set the explicit stride of the UBO 0 uniform.
nir_to_tgsi: Use explicit sizes of NIR variables for UBO declarations.
ci/freedreno: Annotate a bunch of piglit fails/crashes.
ci/freedreno: Add a bunch of recent a530 and a630 flakes.
ci/v3dv: generalize the buffer_access.through_pointers flakes.
ci/freedreno: Fix xfail update for arb_draw_indirect.
freedreno/ir3: Don’t use isam for coherent image loads on a6xx.
freedreno/ir3: Clarify what’s going on in a4xx SSBO atomics.
freedreno/ir3: Refactor a3xx ibo/ssbo load/store instruction XML.
freedreno/ir3: Add encode/decode support for a5xx’s LDIB.
freedreno/ir3: Use LDIB for coherent image loads on a5xx.
osmesa: Add a unit test for resizing buffers.
cso: Revert using FS sampler count for other stages at context unbind.
mesa/st: Add an assertion for finalize_nir versus PIPE_CAP_TEXCOORD.
i915g: Simplify the process of texcoord mapping to TGSI semantics.
i915g: Expose PIPE_CAP_TGSI_TEXCOORD.
i915g: Add finalize_nir.
mesa/st: Add an optional GLSL link fail msg to finalize_nir.
i915g: Reject non-unrolled loops or non-flattened IFs at link time.
ci/iris: Mark create_context-no_error as failing.
ci/iris: Unmark dma_buf_import_export tests as failing.
ci/iris: Consistently use .test-manual-mr for our unstable hardware.
ci/iris: Switch GL/GLES testing to suites.
freedreno/a6xx: Emit a WFI after event writes flushing CCU.
ci/freedreno: Fix typo in glx-tfp flake annotation.
ci/freedreno: Mark a630 basic-glsl-misc-fs as flaky.
ci/freedreno: Skip slow SizedDeclarationsPrimitive in CI.
llvmpipe: Free CS shader images on context destroy.
llvmpipe: Fix leak of CS local memory with 0 threads.
llvmpipe: memcpy user_buffers at set_constant_buffer time.
nir_to_tgsi: Fix indirect addressing of atomic counters.
nir_to_tgsi: Don’t forget to add sampler views with our samplers.
nir_to_tgsi: Add support for memory_barrier_tcs_patch.
nir_to_tgsi: Clean up some unnecessary pointers-to-uregs.
nir_to_tgsi: Switch ssa_temp[] to be a ureg_src.
nir_to_tgsi: Allow SSA defs to include swizzles, abs, and neg.
mesa: Move the advanced blend bitmask to shader_info.
nir: Add a nir_instr_free() to replace ralloc_free(instr).
nir: Pull the instr list free function out to a helper.
nir/from_ssa: Use nir_instr_free() to free instrs instead of ralloc.
nir: Consistently pass the shader to the shader arg of instr creation.
nir: Consistently pass the instr to nir_src_copy().
nir: Add all allocated instructions to a GC list.
nir/lower_phis_to_scalar: Use nir_instr_free() to free instrs.
nir/tests: Fix transmuting an SSA dest to be non-SSA
nir: Switch from ralloc to malloc for NIR instructions.
nir: Drop the unused instr arg for src/dest copy functions.
ci/freedreno: Drop minetest from a3xx trace testing.
freedreno: Precompute resource pointer hash values.
freedreno: Use TC’s flag for whether get_query is in the driver thread.
freedreno: Move the batch cache to the context.
freedreno: Remove the submit lock locking.
freedreno: Use a BO bitset for faster checks for resource referenced.
freedreno: Remove dead fd_batch_reset().
ci/i915g: Clarify failure happening in fbo-fragcoord2.
mesa/st: Allow loops in GLSL when NIR is enabled, even if the HW can’t.
freedreno: Fix autotune regression since batch-cache rework.
freedreno: Assert to check for the previous regression.
ci/freedreno: Add some cubearray piglit flakes on a630 I noticed.
ci/baremetal: Retry if our network device spontaneously fails.
ci/freedreno: Update restricted trace sha1s.
nir_to_tgsi: Remove the abs on fcsel’s bool src.
freedreno/a5xx+: Rename GRAS_CNTL/RB_RENDER_CONTROL0 IJ_LINEAR_* bits.
freedreno/a5xx+: Set the IJ_LINEAR_* request bits if we need the regs.
tu: Move core features definitions to a helper function.
tu: Deduplicate extension/core feature flags.
tu: Add GetPhysicalDeviceFeatures2() support for more VK 1.2 core features.
tu: Move VK 1.1 core properties to a helper function and use macros for exts.
tu: Support VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROTECTED_MEMORY_PROPERTIES.
turnip: Move physical device 1.2 properties to a helper function.
mesa: Throw an error for compressed glGenerateMipmap on GLES2 contexts.
mesa: Prioritize checking for GLES2’s uniform transpose error.
mesa: Fix missing CopyTexImage formats for OES_required_internalformat.
ci/vc4,i915g: Add links to VK-GL-CTS issues for some of our xfails.
vulkan: Add helpers for filling exts for core features and properties.
vulkan: Support PHYSICAL_DEVICE_1_n_ features/properties in the helpers.
turnip: Use the shared now-in-core feature/prop extension helper functions.
anv: Use the shared now-in-core feature/prop extension helper functions.
radv: Use the shared now-in-core feature/prop extension helper functions.
vulkan: Update the XML and headers to 1.2.193
turnip: Set the VK_DRIVER_ID to our new enum.
turnip: Swizzle in 0, 1 for D24S8 STENCIL_ASPECT sampling.
turnip: Disable VK_EXT_display_control.
i915g: Improve debug output for the fresh-batch overflow case.
i915g: Remove dead VBUF_USE_POOL code.
i915g: Unifdef VBUF_MAP_BUFFER.
i915g: Use the non-vbuf code path by default to fix index overflows.
ci/freedreno: Disable flaky a530 for now.
gallium/dri: Make YUV formats we’re going to emulate external-only.
turnip: Match the blob’s format for vendorID and deviceID.
turnip: Expose a device name similar to the blob.
freedreno/rnndec: Fix use of undefined value_orig in the !ti case.
freedreno/rnndec: Avoid making 0-length variable length arrays.
freedreno/afuc: Avoid ubsan warns about shifting to the top bit of ‘int’
freedreno: Fix UBSan failures in cffdec’s (uint8_t)x << 24
freedreno: Reuse u_math.h instead of open coding ALIGN/ARRAY_SIZE.
freedreno: Reuse u_math.h instead of open coding uif().
freedreno: Move afuc tests to meson unit tests.
freedreno: Move crashdec/cffdec tests to be meson unit tests.
freedreno: Move the headergen2 test to be meson unit tests.
panfrost: Disable flaky piglit job for now.
ci/freedreno: Restart the run if cheza spontenously reboots.
freedreno/tools: Fix build failure when cffdump isn’t built but tests are.
freedreno/a6xx: Move the format table to common code.
freedreno/a6xx: Add int/scaled/snorm vertex formats to match turnip.
freedreno/a6xx: disable vertex fetch support flag for b8g8r8a8_srgb.
freedreno/a6xx: Add support for EXT_texture_sRGB_R8/RG8.
freedreno/a6xx: Drop texturing support from other scaled formats.
freedreno/a6xx: Add some more 16-bit rgb/rgba swaps to our format tables.
freedreno/a6xx+: Add support for the R8G8_R8B8 and G8R8_B8R8 formats.
util/format: Add an RGB planar format for YV12, like we have for NV12.
freedreno/a6xx: Put R8_G8_B8_420_UNORM in the format table.
freedreno/a6xx: Use fd6_pipe2tex() for the 2D src format.
freedreno/a6xx: Make the format table const.
freedreno/a6xx: Rewrite the format table format/swap helpers.
freedreno/a6xx: Add support for A/XRGB1555 formats.
freedreno/a6xx: Enable UBWC for RGBA5551 (and 1555) textures.
turnip: Give D32_SFLOAT_S8_UINT a native format.
turnip: Switch tu_format internals to using pipe_format more.
turnip: Do format lookups from the fd6 format table and cross-check.
turnip: Replace our format table with fd6_format_table.
i915g: Check for the scanout-layout conditions before setting level info.
mesa/st: Don’t bump locations of patch vars for !PIPE_CAP_TEXCOORD.
nir_to_tgsi: Include txf_ms’s sample index.
nir_to_tgsi: Add support for load_output/load_per_vertex_output.
gallium/ureg: Sort the input decls, too.
nir_to_tgsi: Add support for declaring image arrays.
nir_to_tgsi: Add support for load_barycentric_sample.
nir_to_tgsi: Add support for nir_intrinsic_load_barycentric_at_sample.
nir_to_tgsi: Turn GS PRIMID into an input instead of a sysval.
nir-to-tgsi: Avoid emitting TXL just for lod 0 on non-vertex shaders.
nir_to_tgsi: Sort FS output declarations to avoid virglrenderer bugs.
nir_to_tgsi: Add a workaround for virgl UBO array dynamic indexing.
nir_to_tgsi: Force the TXQ LOD argument to be scalar.
virgl: Add support for NIR shaders when VIRGL_DEBUG=nir.
turnip: Plug the vendor/device ID into the pipeline cache fields, too.
turnip: Fix allocation failure handling around device->name.
turnip: Free disk cache on pdev init failure.
ci/freedreno: Move the other a530 test jobs to test-manual-mr.
ci/freedreno: try to fix the a630 cubearray flake’s regex.
ci/freedreno: Disable the minetest trace due to flaky shader code.
ci: Update deqp to vulkan-cts-1.2.7.1.
ci: Update piglit to 7d7dd2688c214e1b3c00f37226500cbec4a58efb.
radeonsi: Fix leak of screen->perfcounters.
Revert “ci: Add osmesa to Windows GitLab CI”
ci/deqp-runner: Drop SUMMARY_LIMIT env var.
ci/deqp-runner: Simplify the --jobs argument setup.
ci/deqp-runner: Use new deqp-runner’s built-in renderer/version checks.
ci/deqp-runner: Drop silly CSV env vars.
ci/deqp-runner: Move remaining asan runs to --env LD_PRELOAD=
ci/deqp-runner: Drop LD_LIBRARY_PATH=/usr/local for libkms workaround.
ci/deqp-runner: Don’t start GPU hang detection for making junit results.
ci/deqp-runner: Move more non-suite logic under the non-suite ‘if’.
ci/piglit-runner: Fix funny indentation of the piglit-runner command.
ci/deqp-runner: Rename the deqp-drivername-*.txt files to drivername-*.txt
ci/piglit-runner: Merge piglit-driver-*.txt files into driver-*.txt.
ci: Enable testing radeonsi’s libva using libva-util unit tests.
freedreno: Fix gmem invalidating the depth or stencil of packed d/s.
freedreno/a6xx: Fix partial z/s clears with sysmem.
freedreno/a6xx: Don’t try to generate mipmaps for SNORM with our blitter.
freedreno/ir3: Fix off-by-one in prefetch safety assert.
freedreno/a6xx: Emit a null descriptor for unoccupied IBO slots.
mesa/st: Disable NV_copy_depth_to_color on non-doubles-capable HW.
Emmanuel Gil Peyrot (3):
radv: Support device initialization without LLVM dependencies
radv: Support shader compilation without LLVM dependencies
radv: Allow building when LLVM isn’t enabled
Enrico Galli (11):
microsoft/spirv_to_dxil: Adding continue opt pass to fix DXIL loop gen
nir_lower_readonly_images_to_tex: Fix typeo on image arrays
microsoft/compiler: Add support for arrays to image_store
microsoft/compiler: Correctly flag when using raw buffers
microsoft/spirv_to_dxil: Enable support for shared memory
microsoft/compiler: Add support for local_invocation_index
spirv_to_dxil: Convert out parameters to a single object
nir: Add CAN_REORDER to load_ubo_dxil
spirv_to_dxil: Add support for nir_intrinsic_load_num_workgroups
spirv_to_dxil: Add support for non-zero vertex and instance indices
nir_to_dxil: Add tagging raw SRVs in shader flags
Eric Engestrom (45):
docs: add release notes for 21.1.5
docs: update calendar and link releases notes for 21.1.5
docs: drop duplicate `21.1` branch name from release calendar
docs: add release notes for 21.1.6
docs: update calendar and link releases notes for 21.1.6
pick-ui: drop assert that optional argument is passed
pick-ui: show nomination type in the UI
pick-ui: show commit date
docs: add release notes for 21.1.7
docs: update calendar and link releases notes for 21.1.7
python: explicitly require python3
gitlab-ci: stop installing python-is-python3 package
python: drop python2 support
Revert “python: Explicitly add the ‘L’ suffix on Python 3”
isl: drop comment about “python 2 vs 3” as it doesn’t apply anymore
isl: drop left-over comment
glsl/tests: remove some dead code
python: drop explicit output_encoding='utf-8' in mako templates
docs: add release notes for 21.1.8
docs: update calendar and link releases notes for 21.1.8
docs: add plan for 21.3.x release cycle
docs: shorten “last release” note to fit on the website without horizontal scrolling
bin/khronos-update.py: update the branch name (s/master/main/)
bin/khronos-update.py: add upstream for vulkan_directfb.h & vulkan_screen.h
gitlab: convert old REVIEWERS into GitLab’s CODEOWNERS
CODEOWNERS: add SWR maintainers
CODEOWNERS: add intel group
CODEOWNERS: add android build system
CODEOWNERS: add @alyssa for Asahi and Panfrost
CODEOWNERS: add @bbrezillon for src/panfrost/vulkan/
CODEOWNERS: add @jenatali for Microsoft & D3D12
egl: sync eglext.h & egl.xml from Khronos
egl: implement EGL_EXT_present_opaque on wayland
VERSION: bump for 21.3.0-rc1
.pick_status.json: Update to 86b3d8c66ce17ddcaefa5bdea68882cc03a57f15
.pick_status.json: Mark 7a2e40df5e8490de739c66865f90fa6804e41f6d as denominated
VERSION: bump for 21.3.0-rc2
.pick_status.json: Update to 4856586ac605e89ee6c128b1a190f000311b49ba
VERSION: bump for 21.3.0-rc3
.pick_status.json: Update to c356f3cfce9459dc1341b6a2a0fd5336a9bdcc3c
VERSION: bump for 21.3.0-rc4
.pick_status.json: Update to 549924d53e359c04d7c14b12990178c86d3aad2d
meson: drop duplicate addition of surfaceless & drm to the list of platforms
VERSION: bump for 21.3.0-rc5
.pick_status.json: Update to ba6d389fa7a0ac512cb9d4cdd21efde990f041b1
Erico Nunes (2):
lima: avoid crash with negative viewport values
ci: enable CI for lima again
Erik Faye-Lund (52):
dxil: Set coord_components on the txf in lower_int_sampler
lavapipe: do not assert on more than 32 samplers
lavapipe: do not mark unsupported tests as crashing
gallivm: let nir_lower_tex handle projectors
gallivm: make rho-approximation opt-in instead of opt-out
gallivm: remove pointless no_filter_hacks flag
d3d12: split up root parameter update and set
microsoft/compiler: fix psv-output calculation
microsoft/compiler: harmonize num_psv_inputs with outputs
gallivm: use lp_build_log2_safe for pow
lavapipe: remove stale xfails
lavapipe: remove duplicate xfail with typo
lavapipe: lower mipmapPrecisionBits to 4
gallivm: remove code to force nearest s/t interpolation
llvmpipe: take intersection with bbox for non-legacy points
st/mesa: correct point_tri_clip for gles2
gallivm: fix texture-mapping with 16-bit result
draw: fix stippling of fractional lines
gallium/nir/tgsi: fixup indentation
gallium/nir/tgsi: initialize file_max for inputs
draw: improve numerical stability in clipper
llvmpipe: use preferred attribute interpolation for wide lines
llvmpipe: clamp z to 0..1 range when using polygon offset
llvmpipe: split coefficient calculation and store
llvmpipe: improve polygon-offset precision
lavapipe: fix reported subpixel precision for lines
draw/llvmpipe: correct exponent calculation for negative z
gallium/tgsi: remove unused helper
gallium/tgsi: rip out cylindrical wrap from ureg
gallium/tgsi: rip out cylindrical wrap support
softpipe: rip out cylindrical wrap support
llvmpipe: rip out cylindrical wrap support
microsoft/compiler: remove needless error-returns
microsoft/compiler: return errors from get_n_src
microsoft/compiler: trivial fixes to error-handling
Revert “zink: always init bordercolor value for sampler”
zink: do not warn about rare features until used
zink: initialize pQueueFamilyIndices
zink: avoid overflow when calculating size
zink: do not try to dereference null-key
zink: do not dereference null-pointer
zink: pctx can’t be null here
zink: return false on failure
zink: remove needless NULL-check
zink: avoid memcmping null pointers
zink: avoid checking if src is const twice
zink: give each major intrinsic it’s own emit function
zink: remove needless scope
zink: remove incorrect ASSERTED macro
zink: clean up const-value handling for get_ssbo_size
zink: reduce scope of version-struct hack
zink: avoid generating nonsensical code
Esme Xuan Lim (1):
docs/panfrost: Fix link to use rst syntax
Felix DeGrood (2):
iris: add tile cache flush to iris_copy_region
anv: dirty only state impacted by blorp_exec
Filip Gawin (18):
docs: make most important part of bugs.rst easier to find
radeonsi: improve rounding of zmin
radv: improve rounding of zmin
nir: fix shadowed variable in nir_lower_bit_size.c
nir: fix ifind_msb_rev by using appropriate type
meson: add crocus to default group of drivers for x86/x86_64
nouveau: fix forward declaration of struct
nouveau: use bool literals instead of integers
glsl: use bool literals instead of integers
r300: fix usage of COVERED_PTR_MASKING_ENABLE for r500
r300: make global variables const (if possible)
r300: assert that array in translate_vertex_program is initialized
aco: cleanup assignment of unique_ptrs
r300: implement forgotten tgsi’s cases of textures
r300: fix UB caused by 1 << 31 and 2 << 30
r300: avoid searching for temp variable twice
nir: avoiding reading unitialized memory when using nir_dest_copy
r300: fixes for UB caused by left shifts
Francisco Jerez (12):
iris: Add read-only domain for VF cache.
iris: Annotate all BO uses through VF cache domain.
iris: Insert buffer-local memory barriers for VF reads.
iris: Add separate dirty bit for VBO flushes.
iris: Insert buffer-local memory barriers for indirect draw parameters.
iris: Add read-write domain for data cache.
iris: Use DATA domain barrier for shader images instead of OTHER domain.
iris: Insert buffer-local memory barriers for SSBO reads and writes.
iris: Insert buffer-local memory barriers for UBO reads.
iris: Use separate dirty bits for UBO and SSBO flushes.
iris: Track dirty UBOs per-stage for more targeted flushing.
iris: Make sure a bound resource is flushed after iris_dirty_for_history.
Georg Lehmann (3):
radv: Use c_msvc_compat_args.
aco: Use cpp_msvc_compat_args.
radv: Remove dead min waves code.
Gert Wollny (3):
mesa: Add support for EXT_clear_texture
mesa: Add EXT_texture_mirror_clamp_to_edge to extension table
mesa: signal driver when buffer is bound to different texture format
Greg V (1):
util: make util_get_process_exec_path work on FreeBSD w/o procfs
Guilherme Gallo (9):
gitlab-ci: enable testing on Intel Whiskey Lake (experimental)
gitlab-ci: enable testing on Intel Comet Lake (experimental)
gitlab-ci: Fix trace expectations for iris devices
gitlab-ci: Fix octopus device type and tag
gitlab-ci: Add sleep for every `scheduler.jobs.logs` call
gitlab-ci: Implement a simple timeout detection for LAVA jobs
gitlab-ci: refactor timeout constants and tweak timeout values
ci: Uprev deqp-runner to 0.9.0
ci: Update linux kernel to v5.15
Gurchetan Singh (3):
drm-uapi: virtgpu_drm.h: context init feature
virgl/drm: query for context init ioctl and supported capset ids
virgl/drm: explicit context initialization
Hoe Hao Cheng (2):
zink: make codegen compatible with python 3.5
zink/codegen: do not enable extensions based on vulkan version
Hyunjun Ko (4):
tu: allow dynamic primitive topology with tessellation
freedreno/a5xx,a6xx: rename MSAA_ENABLE to LINE_MODE in GRAS_SU_CNTL
turnip: enable VK_EXT_line_rasterization
turnip: enable strictLines
Iago Toral Quiroga (40):
ci: disable Broadcom CI
v3dv: remove more dead clearing code
v3dv: refactor meta copy/clear code
v3dv: remove unused layer field from struct rcl_clear_info
v3dv: improve TLB layered image clears
v3dv: allow limiting amount of tile state allocated
v3dv: don’t overallocate tile state for meta TLB operations
v3dv: don’t emit frame setup more than once for multilayered framebuffers
v3dv: fix I/O lowering for GS
v3dv: drop unused parameters
v3dv: store multiview info in our render pass data
v3dv: move all our NIR pre-processing to preprocess_nir
v3dv: inject a custom passthrough geometry shader for multiview pipelines
broadcom/compiler: implement nir_intrinsic_load_view_index
v3dv: broadcast multiview draw commands
v3dv: don’t merge subpasses with different view masks
v3dv: use correct number of layers for multiview
v3dv: skip processing tiles for layers that are not in the view mask
v3dv: track first and last subpass that use a view index
v3dv: fix query error handling
v3dv: implement interaction of queries with multiview
v3dv: expose VK_KHR_multiview
v3dv: fill in drmFormatModifierTilingFeatures
v3dv: handle IMAGE_DRM_FORMAT_MODIFIER_EXPLICIT_CREATE_INFO_EXT
docs: flag VK_KHR_multiview as implemented for v3dv
broadcom/compiler: add a vir_get_cond helper
broadcom/compiler: Flags are per-thread state in V3D 4.2+
broadcom/compiler: make spills of conditional writes also conditional
broadcom/compiler: rewrite partial update liveness tracking
v3d,v3dv: add options to force 32-bit or 16-bit TMU precision
v3dv: don’t try to access pColorBlendState if rasterization is disabled
v3dv: add API entry points for sampler Ycbcr conversions
vulkan: allow creating color views from depth/stencil images
v3dv: make v3dv_image derive from vk_image
v3dv: use subresource helpers in more places
v3dv: make v3dv_image_view derive from vk_image_view
v3dv: honor VkPhysicalDeviceFeatures2 in pNext chain of VkDeviceCreateInfo
broadcom/compiler: don’t enable early fragment tests if shader writes Z
v3dv: start using Broadcom’s device identifiers
broadcom/compiler: fix assert that current instruction must be in current block
Ian Romanick (65):
nir/gcm: Clear out pass_flags before starting
util/queue: Don’t crash in util_queue_destroy when init failed
iris: Add a comment for iris_uncompiled_shader::nir
iris: Fix return type of iris_compile_*
iris: Unify iris_delete_[shader stage]_state functions
iris: Unify iris_create_[shader stage]_state functions
iris: Merge iris_create_[shader stage]_state funcs into iris_create_shader_state
iris: Ref count the uncompiled shaders
iris: Extract allocation bits from iris_upload_shader to iris_create_shader_variant
iris: Allocate shader variant in caller of iris_upload_shader
iris: Add the variant to the list as early as possible
iris: Don’t pass the shader key to iris_compile_[shader stage]
iris: add sync_compile option
iris: Enable threaded shader compilation
iris: Split iris_upload_shader in two
intel/compiler: Add id parameter to shader_debug_log callback
intel/compiler: Add id parameter to shader_perf_log callback
mesa: Fix tiny race condition in _mesa_debug_get_id
util: Add and use functions to calculate min and max int for a size
isl: Use CLAMP macro instead of MIN of MAX
nir/opcodes: Use u_intN_(min|max)
Revert “nir/algebraic: Convert some f2u to f2i”
intel/fs: sel.cond writes the flags on Gfx4 and Gfx5
gallium: Remove “optimize” parameter from pipe_screen::finalize_nir
intel/compiler: Document and assert some aspects of 8-bit integer lowering
nir/algebraic: Optimize some extract forms resulting from 8-bit lowering
intel/fs: Allow copy propagation between MOVs of mixed sizes
intel/fs: Emit better code for u2u of extract
nir/algebraic: Remove spurious conversions from inside logic ops
nir: intel/compiler: Add and use nir_op_pack_32_4x8_split
intel/compiler: Lower 8-bit ops to 16-bit in NIR on all platforms
util/xmlconfig: Make unit tests more resilient against user env settings
util/xmlconfig: Test values set via the environment
nir/lower_bit_size: Support add_sat and sub_sat
nir/opcodes: Add integer dot-product opcodes
nir/algebraic: Basic patterns for dot_4x8
intel/compiler: Basic support for DP4A instruction
nir/algebraic: Add lowering for dot_4x8 instructions
nir/algebraic: Add some extract optimizations
spirv: Update headers and metadata from latest Khronos commit
spirv: Add support for SPV_KHR_integer_dot_product
intel/fs: Refactor some cmod propagation tests
intel/fs: Remove redundant inst->opcode checks in cmod prop
intel/fs: Add many cmod propagation tests involving MOV instructions
intel/fs: Fix a cmod prop bug when the source type of a mov doesn’t match the dest type of scan_inst
intel/compiler: Move type_is_unsigned_int to brw_reg_type.h
intel/fs: cmod propagate from MOV with any condition
intel/fs: Remove condition-based restriction for cmod propagation to saturated operations
intel/fs: Remove after parameter from test_saturate_prop
intel/fs: Remove type-based restriction for cmod propagation to saturated operations
anv: Enable KHR_shader_integer_dot_product
nir/lower_gs_intrinsics: Return progress if append_set_vertex_and_primitive_count makes progress
nir/lower_gs_intrinsics: Make nir_lower_gs_intrinsics be idempotent
iris: crocus: Use shader_info::is_arb_asm flag
iris: Calculate uses_atomic_load_store after all lowering
nir/edgeflags: Add a flag to indicate the edge flag input is needed
iris: Eliminate iris_uncompiled_shader::needs_edge_flag
iris: Move iris_set_max_shader_compiler_threads and iris_is_parallel_shader_compilation_finished
iris: Add finalize_nir
spirv: Silence unused parameter warnings in vtn_alu.c
spirv: Minor cleanup in SpvOpFOrdNotEqual
spirv: SpvOpFUnordNotEqual doesn’t need special treatment
spirv: Generate shorter code for SpvOpFUnord comparisons
nir/algebraic: Small optimizations for SpvOpFOrdNotEqual and SpvOpFUnordEqual
nir/loop_unroll: Always unroll loops that iterate at most once
Icecream95 (26):
pan/decode: Avoid undefined behaviour on shift in bits()
pan/gen_pack: Use 1U for unpacking log2 to avoid undefined behaviour
pan/bi: Print the clause of branch targets
pan/bi: Use padding bytes for checking whether to stop disassembly
pan/bi: Fix infinite loop parsing arguments for bifrost_compiler
pan/mdg: Analyze helper termination after scheduling
pan/bi: Use the computed scale for fexp NaN propagation
panfrost: Call primconvert and u_transfer_helper destroy functions
pan/bi,pan/mdg: Fix memory leak of hash tables
panfrost: Fix memory leaks for compute state
panfrost: Free TGSI tokens
panfrost: Free NIR when deleting shader state
pan/mdg: Reduce size of tex_opcode_props
panfrost: Fill tiler job padding again
panfrost: Add nocache debug flag for disabling the BO cache
panfrost: Only allow colour blit shaders to be killed
panfrost: drm-shim support
pan/bi: Extend bi_add_nop_for_atest for tilebuffer loads
lima: Enable PIPE_CAP_VERTEX_COLOR_UNCLAMPED
lima: Fix crashes for GPUs with more than four cores
lima: Improve error messages for unsupported GP operations
lima: Add a noop drm-shim
pan/bi: Don’t set dependencies for +BLEND in blend shaders
pan/mdg: Remove use of global variables in disassembler
panfrost: Add ASTC 3D texture format entries
pan/mdg: Use the correct swizzle for condition moves
Ilia Mirkin (7):
st/mesa: fix pbo download store image type
mesa: don’t return errors for gl_* GetFragData* queries
mesa: rgb10_a2 is never color-renderable in gles2
glsl: fix explicit-location ifc matching in presence of array types
freedreno: use OUT_WFI for emit_marker
a4xx: add some better documentation for compute registers
a4xx/computerator: add initial backend
Italo Nicola (6):
ci: skip minio login if PIGLIT_REPLAY_UPLOAD_TO_MINIO is not set
virgl/ci: switch glmark2 traces from .rdc to .trace
virgl/ci: stop overriding GL version when running traces
virgl/ci: enable some traces that were previously crashing
main: don’t always clamp pixels read from snorm buffers
panfrost: fix null deref when no color buffer is attached
Iván Briano (8):
anv: Don’t advertise unsupported shader stages
anv: fix some multisample lines_wide CTS tests
anv: Unbreak wide lines on HSW/BDW
anv: fix feature/property/sizes reported for fragment shading rate
anv: Allow unused VkSpecializationMapEntries
anv: Don’t copy the lineStipple values if lineStipple is not enabled
vulkan: fix handling of aliases in enum members
vulkan: Generate defines for aliases of promoted enums
James Park (1):
aco: Work around MSVC restrict in c99_compat.h
Jan Beich (1):
meson: disable -Werror=thread-safety on FreeBSD
Faith Ekstrand (192):
intel/dev: Handle CHV CS thread weirdness in get_device_info_from_fd
intel/dev: Put the device name in intel_device_info
intel/dev: Handle BSW naming issues
intel/dev: Add a max_cs_workgroup_threads field
intel/dev: Drop a bogus assert
nir: Better document the Boissinot algorithm in nir_from_ssa()
iris: Re-emit MEDIA_VFE_STATE for variable group size shaders
anv: Handle errors properly in anv_i915_query
intel: Pull anv_i915_query into common code
anv: Use intel_i915_query_alloc for memory regions
iris: Use intel_i915_query for meminfo
intel/dev: Use intel_i915_query_alloc in query_topology
intel/perf: Use intel_i915_query_flags instead of hand-rolling it
intel/eu: Start validating LSC message descriptors
anv: Assume syncobj support
anv: Drop unused sync_file and BO semaphore code
anv: Stop reference counting semaphores
glsl/nir: Use nir_ssa_undef() from nir_builder
nir: Set IMAGE_DIM and IMAGE_ARRAY on deref intrinsics
nir: Set src_components = -1 for image intrinsic deref sources
nir: Add a format field to _deref image intrinsics
nir/lower_subgroups: Handle down-casts in uint_to_ballot_type
nir/lower_image: Handle index and bindless image_size
nir/lower_tex: Add a lower_txs_cube_array option
radv,radeonsi: Do cube size divide-by-6 lowering in NIR
turnip: Replace tu_lower_image_size with nir_lower_image
intel/eu: Don’t validate LSC transpose on ops that don’t have it
ttn: Don’t handle texop_txf_ms_mcs
amd: Don’t handle nir_tex_src_ms_mcs
panfrost: Don’t handle nir_texop_txf_ms_mcs
nir: Suffix all the MCS texture stuff _intel
docs,nir: Document NIR texture instructions
intel/blorp: Use nir_texop_txl
nir/lower_tex: Rework invalid implicit LOD lowering
nir: Validate newly documented texture restrictions
anv/android: Rework our handling of AHardwareBuffer imports
nir: Removing uses of SSA defs destroys SSA liveness
nouveau: Use nir_lower_tex for projectors
anv/blorp: Drop some can_ycbcr checks
anv/blorp: Use the isl_surf for computing level_width/height in anv_image_ccs_op
anv: Rename anv_get_format_plane to anv_get_format_aspect
anv: Rework depth/stencil early return in anv_get_format_plane
anv: Add a get_format_plane helper and use it in image setup
anv: Use anv_get_format_plane in anv_get_image_format_features
anv: Use anv_get_format_plane for color image view setup
anv: Stop assuming planes are in aspect-bit-order
anv/image: Rework YCbCr image aspects
anv: Rework our aspect/plane helpers
anv: Make anv_image_aspect_to_plane take an anv_image*
intel/eu: Set scope to TILE for TGM flushes
meson/intel: Don’t build genxml tests on Android
meson: Intel drivers don’t require expat on Android
meson/glsl: Only run GLSL tests if can_run_host_binaries()
intel/vec4: Don’t override emit_urb_write_opcode for SNB GS
intel/perf: Use a char array for OA perf query data
anv/android: Pass the correct pointer type to vk_errorf
anv/android: Drop unused device variables
ci: Build ANV on Android
include/drm-uapi: Bump headers
anv: Use I915_MMAP_OFFSET_FIXED for LMEM platforms
iris: SMEM buffers on discrete platforms are coherent
iris: Use a tiny table to map mmap modes to offsets
iris: Add an assert to iris_bo_gem_mmap_legacy()
iris: Add a new IRIS_MMAP_NONE map type
iris: Use I915_MMAP_OFFSET_FIXED for LMEM platforms
anv: Use I915_USERPTR_PROBE when available
intel/isl: Explicitly set offset_B = 0 in get_uncomp_surf for arrays
intel/isl: Add units to view dimensions in isl_surf_get_uncompressed_surf
intel/isl: Better document isl_tiling_get_intratile_offset_*
intel/isl: Add a missing assert in isl_tiling_get_intratile_offset_sa
intel/isl: Use uint64_t for computed byte offsets
anv/image: Use planes[i]->primary_surface.isl.format in check_drm_format_mod
anv: Delete anv_image::format
vulkan: Add a vk_image struct
anv: Make anv_image derive from vk_image
anv,vulkan: Move anv_image_expand_aspects to common code
anv,vulkan: Move VkImageSubresource* helpers from ANV
vulkan: Refactor and better document vk_image_expand_aspect_mask
radv: Add asserts to vk_format_depth/stencil_only
vulkan,radv: Move vk_format_depth/stencil_only to common code
vulkan: Add a vk_image_view struct
anv: Make anv_image_view derive from vk_image_view
anv,vulkan: Move ANV image layout helpers to common code
anv,vulkan: Move drm_format_mod to vk_image
anv,vulkan: Add a vk_image::wsi_legacy_scanout bit
anv: Move compute_heap_size lower in the file
anv: Rework init_meminfo
anv: compute available memory in anv_init_meminfo
anv: Set CONTEXT_PARAM_RECOVERABLE to false
intel/compiler: Add unified barrier support for CS
intel/isl: Add more parameters to isl_tiling_get_info
isl/docs/tiling: Add Tile4 docs
intel/fs: Add support for atomic_fadd
anv: Advertise support for shaderBufferFloat32AtomicAdd
nir: Properly clean up nir_src/dest indirects
nir: Stop sweeping indirects
spirv: Handle the SubgroupSize execution mode
intel/fs: Handle required subgroup sizes specified in the SPIR-V
iris: Handle states=NULL in iris_bind_sampler_states
iris: Return 1 for PIPE_COMPUTE_CAP_IMAGES_SUPPORTED
panvk: Use vk_queue
panvk: Use vk_command_buffer
vulkan: Add the pCreateInfo to vk_queue_init()
anv: Drop anv_queue::flags
radv: Drop radv_queue::flags/queue_family_index/queue_idx
lavapipe: Drop lvp_queue::flags
turnip: Drop tu_queue::flags/queue_family_index/queue_idx
v3dv: Drop v3dv_queue::flags
panvk: Drop panvk_queue::flags/queue_family_index
vulkan/device: Add a common GetDeviceQueue2 implementation
vulkan/device: Add a common DeviceWaitIdle implementation
anv: Switch to common GetDeviceQueues2 and DeviceWaitIdle
radv: Switch to common GetDeviceQueues2 and DeviceWaitIdle
turnip: Switch to common GetDeviceQueues2 and DeviceWaitIdle
v3dv: Use the common GetDeviceQueue implementation
lavapipe: Simplify DeviceWaitIdle
lavapipe: Switch to common GetDeviceQueue and DeviceWaitIdle
panvk: Switch to common GetDeviceQueue and DeviceWaitIdle
intel/fs: Rework fence handling in brw_fs_nir.cpp
intel/fs: Ignore SLM fences if shared is unused
intel/fs: Add the URB fence message
intel/fs: Emit URB fences when we have LSC
vulkan/shader_module: Fix the lifetime of temporary shader modules
v3dv: Use VK_DEFINE_*HANDLE_CASTS instead of rolling our own
st/texture: Dedent surface setup in CompressedTexSubImage
st/texture: Fall back to single-slice uploads in st_CompressedTexSubImage
Move a bunch of the CLC stuff from src/microsoft to common code
compiler/clc: Clean ups
compiler/clc: grab opencl-c.h from the system path by default
anv,iris,genxml: Use NumberOfBarriers on XeHP
vulkan/physical_device_features: Drop some unnecessary dependencies
vulkan/physical_device_features: Stop generating a header
radv: Use VK_DEFINE_*HANDLE_CASTS instead of rolling our own
vulkan: Update the XML and headers to 1.2.195
anv: Add an anv_image_get_memory_requirements helper
intel/isl: Add a max_buffer_size limit to isl_device
intel/isl: Simplify isl_format_supports_filtering
intel/isl: Stop claiming ASTC works on Cherry View
anv: Ask ISL about ASTC support
intel/isl: ASTC support was removed on Gfx12.5
genxml: Drop bit 27 from RENDER_SURFACE_STATE::Surface Format
nir/algebraic: Lower fisfinite
nir/algebraic: Add some boolean optimizations
nir/algebraic: Add some opts for comparisons of comparisons
vulkan: Drop vk_object_base_reset
vulkan: Track which objects are client-visible
vulkan/log: Assert if the driver logs a client-invisible object
vulkan/log: Log to instance messages during instance construction
anv: drop a misplaced and wrong comment
anv: Stop printing descriptor pool allocation failures
anv: s/vk_error/anv_error/g
vulkan/log: Handle logging to a physical device
vulkan/log: Add common vk_error and vk_errorf helpers
anv: Drop unused logging helpers
anv/queue: Plumb the queue through all the queue_submit calls
anv: Use the common vk_error and vk_errorf helpers
radv: Stop printing descriptor pool allocation failures
radv: Switch to the new common vk_error helpers
lavapipe: Switch to the new vk_error helpers
panvk: Switch to the new vk_error helpers
v3dv: Switch to the new vk_error helpers
turnip: Plumb non-startup errors through the new vk_error helpers
vulkan/log: Drop _impl from the log helper names
vulkan/instance: Use vk_error in vk_instance_init
vulkan/device: Use vk_error
vulkan/device: Use vk_errorf to report missing features
Revert “mesa: use simple_mtx_t for TexMutex”
nir/lower_discard_or_demote: Fix metadata
vulkan: Generate flag #defines based on bitwidth
vulkan: Generate #defines with every bit in a given bitfield
anv: Use the common wrapper for GetPhysicalDeviceFormatProperties
anv: Flip around the way we reason about storage image lowering
meson: Add and use an idep for Vulkan WSI
vulkan/wsi: Add a dispatch table for WSI entrypoints
vulkan/wsi: Add common wrappers for most entrypoints
anv: Use the common WSI wrappers
radv: Use the common WSI wrappers
turnip: Use the common WSI wrappers
v3dv: Use the common WSI wrappers
panvk: Use the common WSI wrappers
lavapipe: Use the common WSI wrappers
venus: Use the common WSI wrappers
vulkan/wsi/common: Delete the wrapper entrypoints
vulkan/wsi/x11: Delete the wrapper entrypoints
vulkan/wsi/wayland: Delete the wrapper entrypoints
vulkan/wsi/display: Delete the wrapper entrypoints
vulkan/log: Tweak our handling of a couple error enums
i965: Emit a NULL surface for buffer textures with no buffer
lavapipe: Don’t wrap errors returned from vk_device_init in vk_error
anv: Fix FlushMappedMemoryRanges for odd mmap offsets
anv: Also disallow CCS_E for multi-LOD images
vulkan/util: Include stdlib.h
Jeremy Newton (1):
Fix building AMD MM/GL with EL7
Jesse Natalie (62):
mesa/main: Check for fbo attachments when importing EGL images to textures
microsoft/compiler: Implement texture loads from UAVs
microsoft/clc: Add a test for compiling a kernel with a read-write image
gallium/dri: Move driConf -> st option processing to aux/util
xmlconfig: Use static inline for regex fallback to prevent -O0 issues
wgl: Parse driconf options
wgl: Add a driver name for driconf
u_driconf: Use a macro to avoid repeating option names
CI: Update Windows quick_gl baseline for mysterious new passes
spirv2dxil: Fix build after spirv_to_dxil signature change
ci/windows: Build spirv-to-dxil
llvmpipe: Don’t wait for already-terminated threads on Windows
mapi: Fix shared-glapi build with MSVC
wgl: Fix unit test when using shared glapi
static-glapi: Fix MSVC preprocessor definitions
wgl: Don’t use BUILD_GL32 for wgl frontend
wgl: Move opengl32.def to target instead of frontend
wgl: Move wgl* non-extension definitions to libgl-gdi
wgl: Make overridden entrypoints local to stw_ext_context
wgl: Refactor drivers to a libgallium_wgl.dll
docs: Update Windows llvmpipe doc for driver split
gl.h: Remove dllimport
wgl: Create contexts and DHGLRCs separately
wgl: Pass share context as pointer instead of DHGLRC
wgl: Make contexts current with pointer instead of DHGLRC
wgl: Allow creating framebuffers that aren’t in the global window list
wgl: Make contexts current with framebuffers instead of HDCs
wgl: Split DrvReleaseContext to support unbind via pointer
wgl: Add iPixelFormat to stw_pixelformat_info
wgl: Un-inline helpers which use stw_own_mutex
wgl: Add an explicit iPixelFormat for context creation
wgl: Use HWND instead of HDC as primary framebuffer handle
wgl: Add a stw_dev getter
wgl: Swap buffers via pointer instead of HDC
wgl: Add stw_* DLL exports for EGL support
meson: Include EGL after gallium
meson, egl: Support building for the Windows platform
egl: Add wgl/gallium dependencies for Windows platform
egl: Use the .def file for Windows
egl: Don’t try to dereference native displays unless there’s a detectable platform
egl: Detect Windows platform using GDI
egl: Add a basic Windows driver
symbols-check: Fix symbol demangling for Windows
egl: Update Windows .def to include missing exports
meson: Set /Zc:__cplusplus for MSVC
CI/windows: Build shared-glapi, EGL, gles2
microsoft/clc: Rename compiler DLL to clon12compiler
microsoft/clc: Clean up clc_context
microsoft/clc: Stop heap-allocating tiny fixed-size transparent structs
microsoft/clc: Split clc_object and rename entrypoints
microsoft/clc: Support SPIR intermediates in the compilation APIs
microsoft/clc: Parse SPIR-V specialization consts into metadata
microsoft/clc: Support passing specialization consts to spirv_to_nir
microsoft/clc: Add API to independently specialize SPIR-V
microsoft/clc: Add a test for specializing via SPIRV-Tools
clover: std::result_of is deprecated in c++17 and removed in c++20
clover: Delete unused ‘e’ exception reference vars
clover: Rename module -> binary, because C++20 makes module a keyword
compiler/clc: Null extensions should mean all supported, not all
compiler/clc: Preserve OCL kernel arg type metadata on LLVM13
util/hash_table: Clear special 0/1 entries for u64 hash table too
d3d12: Fix Linux fence wait return value
Jonathan Marek (1):
freedreno/registers: add a6xx media formats
Jordan Justen (51):
nir: Add nir_lower_image() to lower cube image sizes
intel/compiler: Rename brw_nir_lower_image_load_store to brw_nir_lower_storage_image
intel/compiler: Lower cube image sizes using nir_lower_image()
intel/compiler: Remove cube array size lowering in compiler backend
meson: Search for python3 before python for bin/meson_get_version.py
meson: Check that bin/meson_get_version.py ran without an error
intel/pci-ids: Re-enable DG1 and add SG1
intel/compiler: Regroup TCS barrier code paths
intel/compiler: Add unified barrier support for TCS
iris: Disable the Y-tiled modifiers on XeHP+
intel: Move subslice_total into devinfo
intel/devinfo: Add devinfo->max_scratch_ids
intel/dev: Add is_dg2 to devinfo
intel/isl: Enable MOCS 61 for external surfaces on TGL
intel/dev: Add display_ver and set adl-p to 13
iris: Disable I915_FORMAT_MOD_Y_TILED_GEN12* on adl-p/display 13
Revert “iris: Disable I915_FORMAT_MOD_Y_TILED_GEN12* on adl-p/display 13”
Revert “intel/dev: Add display_ver and set adl-p to 13”
intel/dev: Add display_ver and set adl-p to 13
iris: Disable I915_FORMAT_MOD_Y_TILED_GEN12* on adl-p/display 13
intel/blorp: Move most of BLORP_CREATE_NIR_INPUT into a function
intel/blorp: Add compute support to BLORP_CREATE_NIR_INPUT
intel/blorp: Add shader_pipeline to brw_blorp_base_key
intel/blorp: Add brw_blorp_init_cs_prog_key
intel/compiler: Use INTEL_DEBUG=blorp to dump blorp compute shaders
intel/blorp: Add subgroup_id input for compute programs
intel/blorp: Add blorp_compile_cs
intel/blorp: Split out ps specific sampler state into a separate function
intel/blorp: Split out surface setup from state emission
blorp: Add blorp_alloc_general_state
intel/blorp: Emit compute program based on BLORP_BATCH_USE_COMPUTE
intel/gfx7: Change GPGPU Mode to bool
intel/blorp: Add blorp_get_cs_local_y, blorp_set_cs_dims
intel/blorp: Change discard terminology to bounds
intel/blorp: Add blorp_check_in_bounds()
intel/blorp: Use blorp_check_in_bounds for discards
blorp: Set view usage to ISL_SURF_USAGE_STORAGE_BIT for compute
blorp/clear: Simplify rgb-as-red channel packing
intel/blorp: Convert blorp_clear color_write_disable to a bitmask
intel/blorp: Support compute for slow clears
intel/blorp/blit: Rename wm_prog_key and prog_key to key
intel/blorp: Support some image/buffer blit operations using compute
anv: Store anv_queue_family type in cmd-pool
anv: Prevent starting a render pass on compute queues
anv/blorp: Make sure blorp type is supported by the queue
anv/blorp: Select pipeline based on BLORP_BATCH_USE_COMPUTE
anv/blorp: Add anv_blorp_batch_init, anv_blorp_batch_finish
anv/blorp: Force compute blorp on compute-only queues
anv/slice_hash: Don’t allocate more than once with multiple queues
intel/isl: Add mocs settings for DG2
Revert “iris: Disable I915_FORMAT_MOD_Y_TILED_GEN12* on adl-p/display 13”
Jose Maria Casanova Crespo (8):
Revert “ci: disable Broadcom CI”
v3d/driconf: Expose non-MSAA texture limits for mutter and gnome-shell
v3d: export supported prim types by v3d
v3d: remove primconvert
vc4: export supported prim types by vc4
vc4: remove primconvert
v3d: Enable PIPE_CAP_PRIMITIVE_RESTART
v3d: Enable PIPE_CAP_TEXTURE_MIRROR_CLAMP_TO_EDGE
Joshua Ashton (26):
lavapipe: Use common Vulkan format helpers
radv: Fix DCC image store check
radv: Disable DCC on storage images that cannot support DCC image stores
ac/surface: Add modifiers capable of DCC image stores
ac/surface: Add ac_modifier_supports_dcc_image_stores helper
radv: Expose modifiers that support DCC image stores with STORAGE_IMAGE_BIT
radv: Push box traversal results onto stack in correct order
radv: Add noatocdithering option to RADV_DEBUG
vulkan/util: Cast vk_alloc pointers
radv: Rename radv_subpass_barrier function to radv_emit_subpass_barrier
radv: Define extern “C” linkage if C++
ac/surface: Add helper for checking if a surface supports DCC Image stores
radv: Use common DCC image store check
radeonsi: Use common DCC image store check
radv: Remove assert in radv_rt_bind_tables
radv: Do not pass result to insert_traversal_aabb_case
radv: Implement build_node_to_addr for GFX8 and below
radv: Implement software emulation for intersect_ray
radv: Enable raytracing extensions on older generations
radv: Add force_emulate_rt perftest option
ac/surface: Use 64 && 128 for GFX10_3 on non-modifier path
ac/surface: Add ac_modifier_max_extent
radeonsi: Check if modifier supports the image extent
radv: Respect max extent for modifiers
ac/surface: Expose modifiers capable of DCC image stores first
radv: Do early and late tests for fast clears
Joshua Watt (1):
v3d, vc4: Fix dmabuf import for non-scanout buffers
José Fonseca (1):
llvmpipe: Add a linear rasterizer optimized for 2D rendering.
Juan A. Suarez Romero (35):
broadcom/compiler: emit TMU flush before a jump
ci/v3dv: update expected results
ci/v3d: add piglit flake test
v3d: handle debug options with debug_named_value
v3dv: assert job->cmd_buffer is valid
ci/v3dv: update vulkan expected results
broadcom: remove v3dv3 from neon library
ci: update to VK-GL-CTS 1.2.7.0
drm-uapi: add v3d performance counters
v3d: check if device supports performance monitors
v3d: attach performance monitor to jobs
v3d: move queries to pipe queries
v3d: add fence wait function
v3d: implement performance counter queries
v3d/simulator: implement performance counters
gallium/hud: initialize query
ci/v3dv: update expected results
broadcom/compiler: change current block on setting spill base
v3d: print error on perfmon destroy error
ci/vc4: update piglit expected results
broadcom/compiler: set current block on incrementing unifa
ci/v3dv: update flakes
v3dv: initialize CL submission structure
v3d/ci: add piglit flake
broadcom/ci: use deqp-runner suites for gles
broadcom/qpu: remove duplicated opcode variable
broadcom/compiler: check instruction belongs to current block
mesa: fix default texture buffer format
broadcom: make vir_emit_last_thrsw() private
broadcom/compiler: force a last thrsw for spilling
broadcom/compiler: add V3D_DEBUG_NO_LOOP_UNROLL debug option
broadcom: add cl_nobin debug option
ci/v3dv: update flakes
ci/v3d: add piglit flake
ci/vc4: add piglit timeout
Kai Wasserbäch (3):
gallivm: add new wrapper around Module::setOverrideStackAlignment()
gallivm: fix FTBFS on i386 with LLVM >= 13, StackAlignmentOverride is gone
fix(clover/llvm): update code to build with recent versions of LLVM 14 (Git)
Karol Herbst (4):
nv50/ir/nir: fix smem size for GL
nv30: fix emulated vertex index buffers
clover: Local memory needs to be aligned.
spirv: Don’t add 0.5 to array indices for OpImageSampleExplicitLod
Keith Packard (1):
iris: Map scanout buffers WC instead of WB [v2]
Kenneth Graunke (29):
gallium: Remove dead pb_malloc_buffer_create function prototype
iris: Rename bo->gtt_offset to bo->address
iris: Improve the memory layout of iris_bo by fixing pahole issues
iris: Drop dead drm_ioctl prototype
iris: Don’t try to CPU read imported clear color BOs
iris: Use the new I915_USERPTR_PROBE API
iris: Allow SET_DOMAIN to fail when allocating new GEM objects
iris: Stop using SET_DOMAIN on discrete GPUs altogether
iris: Bypass the BO cache when allocating buffers for aux map tables
iris: Mark the aux table buffers with EXEC_OBJECT_CAPTURE.
i965: Only call lower_blend_equation_advanced for fragment shaders
glsl: Assert that lower_blend_equation_advanced is only called for FS
iris: Rewrite bo->index comment to refer to exec_bos[]
iris: Track written BOs via a bitfield rather than exec_object2 entries
iris: Defer construction of the validation (exec_object2) list
iris: Add some accessor wrappers for a few fields.
intel: Finish off the last scraps of bacon
iris: Move some iris_bo entries into a union
iris: Handle multiple BOs backed by the same GEM object in execbuf code
iris: Begin handling slab-allocated wrapper BOs in various places
iris: Introduce a BO_ALLOC_NO_SUBALLOC flag and set it in a few places
iris: Change the validation list debug code to print the BO list instead
iris: Move suballocated resources to a dedicated allocation on export
iris: Suballocate BO using the Gallium pb_slab mechanism
iris: Delete the MI_COPY_MEM_MEM resource_copy_region implementation.
iris: Require a 4K alignment for extra clear color BOs.
iris: Fix MOCS for buffer copies
iris: Fix parameters to iris_copy_region in reallocate_resource_inplace
intel/genxml: Fix MI_FLUSH_DW to actually specify the length properly
Kostiantyn Lazukin (1):
util/u_trace: Replace Flag with IntEnum to support python3.5
Kyle Brenneman (2):
Add copyright comments to the GLVND-related files.
Remove the shebang from eglFunctionList.py.
Leandro Ribeiro (8):
vulkan/wsi/wayland: check directly if we got globals successfully
vulkan/wsi/wayland: do not perform roundtrip when not querying formats
vulkan/wsi/wayland: fix crash when force_bgra8_unorm_first is true
vulkan/wsi/wayland: fold wsi_wl_display_swrast and wsi_wl_display_dmabuf into parent
vulkan/wsi/wayland: always initialize format vector
vulkan/wsi/wayland: add helper function find_format()
vulkan/wsi/wayland: create swapchain using vk_zalloc()
vulkan/wsi/wayland: memset members of image to zero
Leo Liu (8):
frontends/va: Add AV1 picture description
frontends/va: Add AV1 parameter buffers functions
frontends/va: Place AV1 picture and slice parameter buffers functions
frontends/va: Add AV1 profile main to the config
radeon/vcn: Enable the AV1 decode p010 mode
frontends/va: Reallocate p010 buffer for AV1 10 bits decode
radeon/vcn: reuse the dpb buffers when with the same size.
radeon/vcn: add a handling of error for incorrect reference lists
Lepton Wu (3):
gallium: Reset {d,r}Priv in dri_unbind_context
i965: Enable RGBX8888_SRGB format.
virgl: Add an option to disable coherent
Lionel Landwerlin (67):
isl: fix mapping of format->stringname
loader/dri3: create linear buffer with scanout support
nir/lower_shader_calls: adding missing stack offset alignment
anv: fix submission batching with perf queries
drm-shim: implement stat/fstat when xstat variants are not there
intel/disasm: fix missing oword index decoding
anv: don’t try to access Android swapchains
nir/lower_shader_calls: remove empty phis
anv/android: handle image bindings from gralloc buffers
genxml: add more INSTDONE registers for Gfx12.5
intel/error-decode: printout more registers
nir: prevent peephole from generating invalid NIR
intel/fs: fix framebuffer reads
microsoft/clc: small compile fix on Linux
clc: use the defined version for the parser
spirv: don’t fail on CapabilitySubgroupDispatch if supported
spirv: avoid shadowing local variable
spirv: workaround LLVM-SPIRV Undef variable initializers
spirv: don’t bother initializing variables to Undef
microsoft/clc: drop LLVM dependency to version < 12
nir: fix opt_memcpy src/dst mixup
spirv: switch Groups capability to non AMD specific field
microsoft/clc: drop MSVC specific function
microsoft/clc: fix compiler warning on uninitialized variable use
meson: extract libversion checks from clc & clover
anv: honor INTEL_DEBUG=sync
clc: add allowed extension for compile parameter
clc: print warnings/errors on their own line
clc: let user specify the targeted SPIRV version
anv: enable UBO indexing
intel/compiler: add missing line returns to logs
anv: remove redundant VertexURBEntryReadLength setting
nir/lower_io: preserve all metadata when no progress
anv: move GetBufferMemoryRequirement with other buffer functions
anv: implement vkGetDeviceBufferMemoryRequirementsKHR
anv: remove unused function
anv: move VkImage object allocation to anv_CreateImage
anv: implement vkGetDeviceImageMemoryRequirementsKHR
anv: implement vkGetDeviceImageSparseMemoryRequirementsKHR
anv: enable VK_KHR_maintenance4
vulkan: put generated defines into their own header
vulkan: handle new VK_KHR_synchronization2 image layouts
vulkan: remove unused VkCommand
vulkan/util: generate define for a selected few enums
vulkan: implement legacy entrypoints on top of VK_KHR_synchronization2
anv: add missing transition handling bits
anv: make semaphore helper work on a single object
anv: improve readability of pipelined states
anv: implement VK_KHR_synchronization2
spirv: deal with null pointers
anv: switch to use VkFormatFeatureFlags2KHR internally
intel/nir: allow unknown format in lowering of storage images
anv: start computing KHR_format_features2 flags for storage images
anv: implement VK_KHR_format_feature_flags2
anv: fill correct surface state for lowered storage image
isl: only bump the min row pitch for display when not specified
vulkan/wsi/wayland: don’t expose surface formats not fully supported
anv: fix push constant lowering with bindless shaders
intel/dev: fix HSW GT3 number of subslices in slice1
intel/dev: don’t forget to set max_eu_per_subslice in generated topology
intel/dev: reuse internal functions to set mask
intel/dev: fix subslice/eu total computations with some fused configurations
intel/perf: fix perf equation subslice mask generation for gfx12+
intel/devinfo: fix wrong offset computation
intel: remove 2 preproduction pci-id for ADLS
anv: don’t forget to add scratch buffer to BO list
anv: fix multiple wait/signal on same binary semaphore
Liviu Prodea (1):
ci: Add osmesa to Windows GitLab CI
Lone_Wolf (1):
clover: TargetRegistry.h was moved to another folder
Lucas Stach (2):
renderonly: don’t complain when GPU import fails
etnaviv: always try to create KMS side handles for imported resources
Luis Felipe Strano Moraes (2):
docs: Clean up environment variable docs for Intel drivers.
docs: Add documentation regarding INTEL_MEASURE to envvars doc.
M Henning (1):
nouveau: Support nir_intrinsic_*_atomic_fadd
Maniraj D (1):
egl: set TSD as NULL after deinit
Mao, Marc (1):
iris: declare padding for iris_vue_prog_key
Marcin Ślusarz (51):
intel/tools/aubinator_error_decode: tag hanging instruction
anv: share some code between vkCmdDrawIndirectCount and vkCmdDrawIndexedIndirectCount
glsl: evaluate switch expression once
nir/builder: invalidate metadata per function
intel/compiler: use nir_shader_instructions_pass in brw_nir_apply_attribute_workarounds
d3d12: use nir_metadata_none instead of its value
microsoft/clc: preserve only valid metadata in clc_lower_printf_base
microsoft/clc: use nir_shader_instructions_pass in clc_nir_dedupe_const_samplers
microsoft/compiler: preserve all metadata when upcast_phi doesn’t make progress
microsoft/compiler: use nir_shader_instructions_pass in dxil_nir_split_clip_cull_distance
microsoft/compiler: use nir_shader_instructions_pass in dxil_nir_lower_double_math
zink: use nir_shader_instructions_pass in lower_discard_if
zink: use nir_shader_instructions_pass in nir_lower_dynamic_bo_access
genxml: add INSTDONE_GEOM register for Gfx12.5
intel/error-decode: printout INSTDONE_GEOM register for Gfx12.5
glsl/opt_algebraic: disable invalid optimization
glsl: refactor code to avoid static analyzer noise
freedreno/ir3: use nir_metadata_none instead of its value
r600: use nir_shader_instructions_pass in r600_nir_lower_atomics
r600: preserve all metadata when passes don’t make progress
turnip: use nir_shader_instructions_pass in tu_lower_io
intel/compiler: INT DIV function does not support source modifiers
vulkan/wsi/x11: fix shm allocation control flow issue
glsl: propagate errors from *=, /=, +=, -= operators
glsl: break out early if compound assignment’s operand errored out
crocus: drop redundant unlikely’s around INTEL_DEBUG
intel/compiler: drop redundant likely’s around INTEL_DEBUG
anv: drop redundant unlikely’s around INTEL_DEBUG
lima: use nir_shader_instructions_pass in lima_nir_split_load_input
anv: Set graphics pipeline active_stages earlier
anv: Use input assembly state only when pipeline has vertex stage
intel/compiler: use nir_shader_instructions_pass in brw_nir_demote_sample_qualifiers
intel/compiler: use nir_shader_instructions_pass in brw_nir_clamp_image_1d_2d_array_sizes
intel/compiler: use nir_shader_instructions_pass in brw_nir_lower_conversions
intel/compiler: use nir_shader_instructions_pass in brw_nir_lower_mem_access_bit_sizes
intel/compiler: use nir_shader_instructions_pass in brw_nir_lower_scoped_barriers
intel/compiler: use nir_shader_instructions_pass in brw_nir_lower_storage_image
intel/compiler: use nir_shader_instructions_pass in brw_nir_opt_peephole_ffma
intel/compiler: use nir_metadata_none instead of its value
anv: use nir_shader_instructions_pass in anv_nir_add_base_work_group_id
anv: use nir_shader_instructions_pass in anv_nir_lower_ycbcr_textures
anv: preserve all metadata when anv_nir_lower_multiview doesn’t make progress
glsl: preserve all metadata when lower_buffer_interface_derefs doesn’t make progress
nir: preserve all metadata when nir_lower_int_to_float doesn’t make progress
nir: preserve all metadata when nir_propagate_invariant doesn’t make progress
nir: preserve all metadata when nir_opt_vectorize doesn’t make progress
anv: allocate zeroed device object
nir/print: pad 64-bit constants with zeroes
anv: fix potential integer overflow
iris: fix scratch address patching for TESS_EVAL stage
intel: fix INTEL_DEBUG environment variable on 32-bit systems
Marek Olšák (211):
radeonsi: don’t expose no-attachment MSAA 16x on all 1 RB chips due to issues
radeonsi: document a missing synchronization for bindless textures
st/mesa: inline st_setup_arrays on MSVC too by adding a wrapper
mesa: remove unused drawid_offset parameter from DrawGalliumMultiMode
mesa: fix incorrect comment in draw_gallium_multimode
st/mesa: always use PIPE_USAGE_STAGING for GL_MAP_READ_BIT usage
shader_enums,mesa: move VERT_ATTRIB_EDGEFLAG to slot 31 for st/mesa
gallium: change pipe_vertex_element::src_format to uint8_t
gallium: add multi-component 64-bit UINT formats for raw double vertex attribs
gallium: add pipe_vertex_element::dual_slot to move lowering to CSO creation
gallium: lower raw 64-bit vertex formats in cso/vbuf instead of st/mesa
st/mesa: remove lowering of 64-bit vertex attribs to 32 bits
st/mesa: remove st_vertex_program::index_to_input
st/mesa: remove st_vertex_program::input_to_index
radeonsi: improve viewperf snx performance by forcing staging for VRAM buffers
gallium: simplify VRAM uploads by adding PIPE_RESOURCE_FLAG_DONT_MAP_DIRECTLY
gallium/noop: implement fences
gallium/noop: implement shader buffers and shader images
gallium/noop: use threaded_query
gallium/noop: use threaded_resource
gallium/noop: use threaded_transfer
gallium/noop: enable threaded_context to test TC overhead without a driver
gallium/noop: update pipe_screen::num_contexts
gallium/noop: implement a lot of missing screen functions
gallium/noop: implement a lot of missing context functions
radeonsi: allow arbitrary swizzle modes for displayable DCC
radv: allow arbitrary swizzle modes for displayable DCC
ac/surface: allow arbitrary swizzle modes for displayable DCC
gallium: add take_ownership into set_sampler_views to skip reference counting
st/mesa: set take_ownership = true in set_sampler_views
st/mesa: move handling CubeMapSeamless into st_convert_sampler where it belongs
gallium: remove vertices_per_patch, add pipe_context::set_patch_vertices
radeonsi: remove vertices_per_patch parameter from draw-related functions
frontend/dri: add environment variable DRI_NO_MSAA for performance comparisons
gallium: use a packed enum to make pipe_prim_mode 1-byte large with __GNUC__
gallium: change pipe_draw_info::mode to uint8_t on MSVC to make it 1 byte large
glthread: implement glGetUniformLocation without syncing
meson: add missing custom target to generate shader_replacement.h
mesa: add environment variable MESA_NO_SHADER_REPLACEMENT
util/cpu_detect: print num_L3_caches and num_cpu_mask_bits
util/cpu_detect: add/guess support for next Zen CPUs
vbo: merge draws with GL_LINES regardless of line stippling
vbo: check more GL errors when drawing via glCallList
mesa: remove unused indices parameter from validate functions
mesa: fix gl_DrawID with indirect multi draws using user indirect buffer
mesa: skip draw calls with unaligned indices
radeonsi: remove unused depth_clamp_any
radeonsi: remove instancing support from the prim discard compute shader
radeonsi: remove stages_key parameter from si_shader_selector_key
radeonsi: move si_vgt_stages_key determination into si_update_vgt_shader_config
radeonsi: move as_ls/es/ngg setting out of si_shader_selector_key
radeonsi: inline si_get_alpha_test_func
radeonsi: stop using AC_EXP_PARAM_UNDEFINED because it’s not useful
radeonsi: use memcmp and radeon_emit_array in radeon_opt_set_context_regn
radeonsi: correctly use cs instead of gfx_cs in build pm4 helpers
radeonsi: simplify memory usage checking by merging vram and gtt counters
radeonsi: inline remaining big functions in draw_vbo for better snx perf
radeonsi: simplify si_need_gfx_cs_space
winsys/amdgpu: clean up amdgpu_cs_check_space
radeonsi: inline si_need_gfx_cs_space
radeonsi: don’t use SQ_NON_EVENT before GE_PC_ALLOC for better perf on Navi1x
radeonsi: add si_print_current_ib function for debugging
ac/debug: add an option to disable colors for printed IBs
radeonsi: fix a memory leak in si_get_shader_binary_size
radeonsi: set gfx10 registers better in si_emit_initial_compute_regs
ac/gpu_info: fix detection of smart access memory
radeonsi: disable DCC stores on Navi12-14 for displayable DCC to fix corruption
radeonsi: enable DCC stores for clear_render_target on gfx10
radeonsi: add missing make_CB_shader_coherent for DCC stores into copy_image
radeonsi: handle pipe_aligned in compute_expand_fmask
radeonsi: rename DCC_WRITE -> ALLOW_DCC_STORE
radeonsi: track displayable_dcc_dirty for non-compute shaders
radeonsi: enable DCC stores on gfx10.3 APUs for better performance
radeonsi: clean up typecasts in compute_copy_image
ac/llvm: remove load_tess_coord callback
ac/llvm: implement a bunch of NIR AMD intrinsics for NGG
ac: remove needless parameters from ac_shader_abi::emit_outputs
ac: make ac_shader_abi::inputs an array instead of a pointer
ac/llvm: implement nir_intrinsic_overwrite_*_arguments_amd
ac/llvm: implement nir_intrinsic_elect
ac,radeonsi: load VS inputs at the call site of nir_intrinsic_load_input
ac,radv: remove unused inputs array and VS input code
radeonsi: don’t set prefer_mono for fetched instance divisors
radeonsi: ignore the vertex element count in si_shader_selector_key_vs
radeonsi: accurately check if instance divisors need a VS update
radeonsi: don’t update shaders if only the vertex element count changes
radeonsi: correct index_bias_varies usage
radeonsi: remove the primitive discard compute shader
winsys/amdgpu: precompute amdgpu_ib_max_submit_dwords
radeonsi: reduce the frequency of switching GS fast launch on/off
radeonsi: strengthen the VGT_FLUSH condition in begin_new_gfx_cs
radeonsi: skip setting some PGM_HI registers by switching to 32-bit addresses
winsys/amdgpu: include CS ioctl overhead in RADEON_NOOP
radeonsi: enable shader-based prim culling with polygon mode
radeonsi: remove a few fields from si_state_rasterizer
radeonsi: don’t emit PA_SU_POLY_OFFSET_CLAMP if it has no effect
radeonsi: add AMD_DEBUG=ib to print IBs
radeonsi: don’t use NGG passthrough if culling is possible for better perf
radeonsi: fix DCC image stores with display DCC
radeonsi: copy a few nir_shader_compiler_options from RADV
driconf: remove leftover code for allow_incorrect_primitive_id
radeonsi: fix DCC image stores with image descriptors in user SGPRs
radeonsi: add const to the key parameter in si_shader_select_with_key
radeonsi: handle NO_OPT_VARIANT in si_shader_select_with_key
radeonsi: sink memsets and disable uniform inlining in si_shader_selector_key
radeonsi: move PS shader key code into a separate function
radeonsi: don’t memset mono and opt in si_update_ps_shader_key
radeonsi: don’t memset part in si_update_ps_shader_key
radeonsi: divide si_update_ps_shader_key into many separate functions
radeonsi: ignore blitter when computing the PS shader key
radeonsi: update most of the PS shader key in set & bind functions
radeonsi: clean up and clear VS shader key fields related to outputs
radeonsi: update the VS shader key in set & bind functions and remove memsets
radeonsi: rewrite inlinable uniform states for shader keys in si_context
radeonsi: move si_shader_io_get_unique_index calls out of si_get_vs_key_outputs
radeonsi: move PS inputs_read computation out of si_get_vs_key_outputs
radeonsi: unset SI_PREFETCH_* only when we unbind pm4 shader states
radeonsi: make si_update_shaders a C++ template in si_state_draw.cpp
radeonsi: optimize scratch buffer size updates using C++ template arguments
radeonsi: check flatshade and sprite_coord_enable for spi_map in bind_rs_state
radeonsi: move DB_SHADER_CONTROL update for PS out of si_update_shaders
radeonsi: move flat shading VRS enablement out of si_update_shaders
radeonsi: precompute si_vgt_stages_key for NGG in si_shader
radeonsi: deduplicate si_compiler_ctx_state initialization
radeonsi: determine num_vbos_in_user_sgprs from template arguments in draw_vbo
radeonsi: eliminate a not-found conditional for PrimID in si_get_ps_input_cntl
radeonsi: force flat for PrimID early in si_nir_scan_shader
radeonsi: restructure si_get_ps_input_cntl for future refactoring
radeonsi: interleave si_shader_info::input_* in memory for faster emit_spi_map
radeonsi: precompute num_interp for si_emit_spi_map
radeonsi: simplify si_emit_spi_map for back-face colors
radeonsi: inline si_get_ps_input_cntl because it has only one use
radeonsi: unroll loops in si_emit_spi_map using 33 C++ template instantiations
radeonsi: precompute more spi_map code
radeonsi: set prefer_mono outside of si_shader_selector_key
radeonsi: move setting most TCS shader key fields out of si_shader_selector_key
radeonsi: move setting one GS shader key field out of si_shader_selector_key
radeonsi: put si_pm4_state at the beginning of si_shader
radeonsi: eliminate redundant SPI_SHADER_PGM_RSRC3/4_GS register writes
radeonsi: convert gfx10_emit_ge_pc_alloc to radeon_opt_set_uconfig_reg
radeonsi: use a trick to extract and pack edgeflags using fewer instructions
radeonsi: don’t set edgeflags for TES and blit VS
radeonsi: fix incorrect comments about VGT_SHADER_STAGES_EN
radeonsi: enable NGG passthrough when LDS is used, document the real constraints
radeonsi: remove the unused cs parameter from radeon_emit
radeonsi: remove the unused cs parameter from radeon_emit_array
radeonsi: remove the unused cs parameter from radeon_set_(config|context)_reg
radeonsi: remove the unused cs parameter from radeon_set_sh_reg
radeonsi: remove the unused cs parameter from radeon_set_uconfig_reg
radeonsi: remove the unused cs parameter from remaining packet functions
ac/surface: use DCC compatible with image stores for < 4K resolutions
ac/surface: correct a comment about DCC image stores
radeonsi: fix a depth texturing performance regression on gfx6-7
radeonsi: change the units of oversub_pc_factor to integer multiples of 1/4
radeonsi: decrease vertex count threshold for shader culling to 128
radeonsi: set vs_uses_base_instance using C++ template arguments
radeonsi: use the optimal draw packet sequence for VGT_FLUSH
radeonsi: reduce NGG culling on/off transitions by keeping it enabled
radeonsi: clean prefer_mono for the blit VS
radeonsi: don’t check ngg_culling != 0 for fast launch because it’s tautology
ac/gpu_info: fix the comment for the NGG->legacy transition bug
radeonsi: strengthen the ngg->legacy hw workaround, fix fast launch hangs too
radeonsi: fix clearing index_size for NGG fast launch
radeonsi: disallow NGG fast launch on Navi1x because VGT_FLUSH makes it slower
ac/llvm: pass cull options into cull_bbox directly
radeonsi: always use the correct number of vertices in NGG shader code
radeonsi: add gfx10 helpers for determining whether edgeflags are enabled
ac/llvm: rename ac_cull_triangle -> ac_cull_primitive
radeonsi: implement shader-based culling for lines
radeonsi: don’t set DX10_DIAMOND_TEST_ENA for better performance
util: add util_popcnt_inline_asm
util: import u_debug_refcnt, u_hash_table, u_debug_describe from gallium
gallium/util: make pipe_vertex_buffer_reference safe for hashing dst
gallium: add pipe_vertex_state and draw_vertex_state for display lists
gallium/u_threaded: implement draw_vertex_state
gallium/trace: add pipe_vertex_state support
gallium/util: add util_vertex_state_cache for deduplicating the states
st/mesa: add ST_PIPELINE_RENDER_NO_VARRAYS, for future display list support
st/mesa: make setup_arrays more reusable for future display list support
mesa: use pipe_vertex_state in vbo and st/mesa for lower display list overhead
radeonsi: separate VBO descriptor code into a new function (for future work)
radeonsi: implement draw_vertex_state for lower display list overhead
ac/surface: don’t overwrite DCC settings for imported buffers
ac/surface: enable DCC image stores for all displayable DCC on gfx10.3
mesa: add missing unlock_texture into generate_texture_mipmap
util/slab: use simple_mtx_t
util/queue: use simple_mtx_t for finish_lock
gallium/pb_cache: use simple_mtx_t
gallium/pb_slab: use simple_mtx_t
mesa: use simple_mtx_t for TexMutex
mesa: use simple_mtx_t for ShaderIncludeMutex
gallium/u_threaded: fix draw_vertex_state with multi draws
radeonsi: fix a leak in draw_vertex_state if threaded_context is disabled
radeonsi: remove duplicate partial_count variable
radeonsi: add back a workaround for DCC MSAA on gfx9 due to conformance issues
radeonsi: remove GS fast launch
util,gallium: put count in pipe_resource & sampler_view on its own cache line
radeonsi: align pipe_resource & sampler_view allocations to a cache line
radeonsi: fix an out-of-bounds access in si_create_vertex_state
ac/surface: always use suboptimal display DCC with DRM <= 3.43.0
ac/surface: disallow display DCC for big resolutions
ac/surface: enable better display DCC for chips newer than Yellow Carp
radeonsi: simplify how VS_OUT_CCDIST is set
radeonsi: simplify write_psize code in si_get_vs_out_cntl
mesa: fix crashes in the no_error path of glUniform
st/mesa: don’t crash when draw indirect buffer has no storage
radeonsi: enable shader culling for indirect draws
radeonsi: print the border color error message only once
radeonsi: fix 2 issues with depth_cleared_level_mask
radeonsi: fix a typo preventing a fast depth-stencil clear
driconf: disallow 10-bit pbuffers for viewperf2020/maya due to X errors
Marek Vasut (2):
freedreno: a2xx: Handle samplerExternalOES like sampler2D
freedreno: Handle timeout == PIPE_TIMEOUT_INFINITE and rollover
Marijn Suijten (1):
freedreno: Enable Adreno 508, 509 and 512
Mark Janes (3):
anv: Use local memory for block pool BO
anv: Allocate workaround buffer in local memory if present
anv: warn if system memory is used
Martin Krastev (2):
svga: enable DRM mks-stats via hooking to the corresponding DRM ioctls
meson: introduce option vmware-mks-stats controlling the instrumentations of gallium svga driver
Martin Roukala (néé Peres) (1):
radv/ci: mark some tests as flaky on gfx9
Matt Turner (5):
tu: Raise maxDescriptorSetUpdateAfterBindUniformBuffersDynamic to 16
util: Add unit tests for dag
util: Replace recursive DFS with iterative implementation
tu: Free device->bo_idx and device->bo_list on init failure
tu: Enable VK_KHR_uniform_buffer_standard_layout
Michael Tang (11):
spirv_to_dxil: expose version number
spirv_to_dxil: Run nir_lower_tex during compilation
microsoft/compiler: Add support for SV_SampleIndex intrinsic
microsoft/compiler: More robustly handle setting Register=-1
microsoft/compiler: Set the SampleFrequency runtime metadata
microsoft/compiler: Emit a flat interpolation method for SV_SampleIndex
microsoft/compiler: Miscellaneous fixes from running clang-format
microsoft/spirv_to_dxil: Add `install : true` to spirv_to_dxil library.
gallium/d3d12: move d3d12_lower_bool_input to microsoft/compiler
microsoft/spirv_to_dxil: use dxil_nir_lower_bool_input pass
microsoft/spirv_to_dxil: turn sysvals into input varyings
Michel Dänzer (2):
ci: Drop “success” job
ci: Put all container related jobs in a single stage
Michel Zou (6):
zink: Fix unused-variable warning
meson: don’t use missing dumpbin path
radv: fix build with mingw
lavapipe: fix missing VKAPI_CALL attribute
wgl: fix 32 bits mingw exports
docs: mark off missing lavapipe exts
Mike Blumenkrantz (480):
zink: improve detection for broken drawids
lavapipe: increment drawid for multidraws
radv: merge si_write_viewport into radv_emit_viewport
radv: pre-calculate viewport transforms
radv: remove unused variable from radv_emit_viewport
lavapipe: don’t read line stipple info in pipeline creation if stipple is disabled
util/tc: make clear calls async
util/foz: stop crashing on destroy if prepare hasn’t been called
lavapipe: add a padding member to rendering_state
lavapipe: implement VK_EXT_color_write_enable
features: VK_EXT_color_write_enable for lavapipe
zink: check for dedicated allocation requirements during image alloc
zink: hook up VK_KHR_dedicated_allocation
zink: optimize shader recalc
zink: ifdef out some context prototypes/inlines for c++ compile
zink: start adding C++ draw templates
zink: add draw template for dynamic state
zink: make descriptors_update hook return a bool if a flush occurred
zink: if descriptor updating flushes, re-call draw/compute
zink: add template for starting new cmdbuf
zink: split pipeline_changed to use template value separately
zink: stop flagging pipeline dirty for line width changes
zink: don’t rebind vertex buffers if pipeline changes
zink: add a ctx flag for drawid reading
zink: flatten descriptor_refs_dirty into BATCH_CHANGED template
zink: use drawid_offset directly during draw
zink: add a ctx flag for shader reading basevertex
zink: remove screen info stuff from draw templates
zink: add changed flag for blend states
util/tc: add a util function for setting bytes_mapped_limit
radeonsi: use new tc util for setting bytes_mapped_limit
zink: use new tc util for setting bytes_mapped_limit
freedreno: use new tc util for setting bytes_mapped_limit
nir/lower_point_size_mov: zero nir_state_slot::swizzle in new variable
gallium: add pipe_sampler_state::pad member
lavapipe: add support for anisotropic texturing
nir: add nir_imm_ivec3 builder
zink: add mechanism for generating VkBuffers for rebinding
zink: change vbo_bind_count to a mask of slots
zink: handle vertex buffer offset overflows
zink: split and move maybe_flush_or_stall mechanic
zink: split draw_count checking to local variable
zink: make zink_end_render_pass public
zink: make batch_rp and norp static inlines
zink: use a local var for draw mode during draw
zink: add a param to check_batch_completion for toggling lock-taking
zink: rework oom flushing
zink: move mem cache to sub-struct
zink: inline mem cache hash table
zink: split mem cache per type
zink: clamp descriptor allocation bucket sizing to defined limit
zink: add define for descriptor alloc clamping
zink: improve lazy descriptor pool handling
zink: fix cached descriptor allocation clamping
nir/validate: refactor validate_assert to have a return value
zink: use array size in spirv bo length calculations
zink: add screen function for checking usage completion
zink: force batch completion check on query result
zink: add some resource util functions for batch usage
zink: collapse a conditional in zink_batch_resource_usage_set()
zink: use resource batch usage helpers in invalidate_buffer()
zink: simplify some dumb code in invalidate_buffer
zink: use new resource batch usage utils for is_resource_busy
zink: replace some direct batch_usage calls with resource abstractions
zink: remove no longer used internal resource function
zink: more explicitly check shader stages during compile
zink: merge draw_count and compute_count, move to batch struct
zink: improve oom flushing
zink: EXT_vertex_input_dynamic_state
zink: change descriptor flushing to assert
zink: lower subgroup ballot instructions
zink: implement compiler handling for subgroup ballot builtins/intrinsics
zink: remove VK_EXT_shader_subgroup_ballot from device info
zink: export PIPE_CAP_TGSI_BALLOT
zink: add env var to disable timelines
ci: add another zink job with timelines disabled
zink: use dynamic line stipple
zink: use MAP_ONCE for qbo readback
zink: rework buffer mapping
mesa/st: break up st_GetTexSubImage
mesa/st: break up st_choose_matching_format()
mesa/st: enable calling st_choose_format() purely for translation
mesa/st: add format-finding capabilities to pbo get_dst_format()
st/texture: refactor get_src_format() to be more useful
zink: never use staging buffer for unsynchronized buffer maps
zink: force threadsafe mapping for query results when necessary
Revert “zink: simplify some dumb code in invalidate_buffer”
zink: simplify some dumb code in invalidate_buffer (v2)
lavapipe: rework queue to use u_queue
lavapipe: use consistent semaphore variable naming
lavapipe: implement timeline semaphores
features: mark off timelines for lavapipe
zink: add locking for zink_shader::programs
zink: sum available memory heaps instead of assigning
zink: simplify else clause for mem info gathering
nine: don’t memset sampler state during conversion
nine: set CSO_NO_USER_VERTEX_BUFFERS for main cso context
nine: optimize texture binds a bit
nine: split enabled/dummy texture binds into separate iterators
nine: update bound sampler mask directly during texture updates
nine: track bound sampler count to optimize unbinds
nine: enable tc
nir: add imm_vec3 to round these out
nine: init take_index_buffer_ownership for draws
nine: init more draw info members
zink: add a suballocator
zink: repack zink_resource_object struct
zink: stop zeroing structs during resource allocation
zink: split transfer_unmap for images and buffers
zink: split mem unmap logic for images and buffers
zink: make map_count useful for dedicated image allocations
zink: remove PIPE_MAP_ONCE from subdata
zink: rejigger PIPE_MAP_ONCE for internal qbo reads
zink: flake out some tests for now
zink: collapse ‘dedicated’ allocation into zink_bo
zink: remove duplicated zink_resource_object::mem member
zink: split out zink_transfer allocation
zink: split buffer and image map functions
zink: remove unused variable from image map
zink: break out transfer map destroy
zink: handle map failures more effectively
zink: enable compat contexts
zink: ci updates
nir/lower_vectorize_tess_levels: set num_components for vectorized loads
softpipe: fix ci rule ordering to avoid unnecessarily running jobs
zink: simplify get_descriptor_set_lazy params
zink: remove redundant asserts from lazy descriptor set populate
zink: remove repeated lazy batch dd casts
zink: flag the gfx pipeline dirty and unset pipeline shader module on shader change
zink: do compute shader change on bind
zink: clear current gfx/compute program upon unbinding its shaders
zink: clear out all ubo rebinds first if they exist
zink: make descriptor update functions return the updated resource
zink: split out buffer rebinds to helper functions
zink: add bind counts for so bindings
zink: count streamout rebinds when doing buffer rebinds
zink: rebind all buffers on replacement
zink: only force all buffer rebinds if rebinds exist on other contexts
zink: defer deletion of no-attachment framebuffers
zink: stop referencing framebuffers
nine: replace unnecessary dynamic-sized array with bitfield
zink: move void format detection function to zink_format
zink: make component mapping function a static inline
zink: make void swizzle clamping util public
zink: add better TODO note for surface swizzles
zink: fix program init flag
zink: fix pipeline caching
zink: verify program key sizes before checking for default variant
zink: return early when getting resource modifier if no modifier is used
zink: inline program cache structs
zink: track mask of bound gfx shader stages
zink: split gfx shader cache based on stages present
zink: avoid hashing shader stages multiple times for new gfx programs
zink: create compute programs on bind
zink: simplify a bitmask init
zink: stop using dirty_shader_stages for shader binds
zink: add some null checks for shader variant key generation
zink: set inlinable_uniforms_mask first when binding a shader
zink: only remove programs from hash tables on shader deletion if needed
zink: implement PIPE_QUERY_GPU_FINISHED
zink: always init bordercolor value for sampler
zink: require occlusionQueryPrecise for occlusion queries
zink: assert precise queries are occlusion queries
zink: declare ctx var during blend state bind
zink: remove attachment count from pipeline hash
zink: pass current program’s shader array, not ctx array
zink: remove extra unsetting of ctx->vertex_state_changed
zink: reorder gfx program/pipeline/descriptor binds if dynamic state is present
zink: init ctx->gfx_prim_mode to nonzero value to trigger pipeline changes
zink: use ctx gfx prim mode for draw comparisons
zink: remove query flush from memory barrier hook
zink: slim down streamout component of mem barrier hook
zink: batch mem barrier hooks
zink: use dynamic prim type
zink: consolidate pipeline hash tables
zink: no-op prim changes for pipeline recalc
zink: hook up VK_EXT_extended_dynamic_state2
zink: template for VK_EXT_extended_dynamic_state2
zink: bump dynamic pipeline state count
zink: set primitive restart with extended dynamic state2
zink: move dynamic state1 pipeline members into substruct
zink: move viewport count into dynamic state1 part of pipeline hash
zink: zero viewport and scissor count in pipeline with dynamic state1
zink: repack zink_rasterizer_hw_state
zink: add clip_halfz to rasterizer hw state
zink: steal a bit from rast_samples in pipeline state
zink: convert rasterizer pipeline components to bitfield
zink: repack zink_gfx_pipeline_state
zink: make zink_gfx_pipeline_state::vertices_per_patch a bitfield
zink: improve threadsafe qbo access
zink: move time query ending out to zink_end_query
zink: don’t try to sync previous timestamp query qbo values
zink: more effectively utilize batch_usage for query destruction
zink: avoid pulling in unused push descriptors for cached ubo0
zink: remove extra program ref from cached descriptor updates
freedreno: export supported primtypes
freedreno: remove primconvert
freedreno: ci updates
zink: only update inlinable constants when they change
zink: determine whether the gpu has a resizable BAR at startup
zink: implement PIPE_RESOURCE_FLAG_DONT_MAP_DIRECTLY when resizable bar not present
radv: use pool stride when copying single query results
radv: ignore dynamic line stipple if line stipple isn’t enabled
zink: free local shader nirs on program free
zink: use VK_WHOLE_SIZE for full-sized bufferviews
zink: explicitly end renderpass before running dispatch
zink: move alphaToOne warning to a dynamic warning
zink: add input attachment thingy for spirv builder
zink: emit fbfetch variables as ntv input attachments
zink: add a compiler pass to translate fbfetch -> input attachments
zink: refactor descriptor layout/template creation a little
zink: track fbfetch info on context, update as needed
zink: flag color attachment images as input attachments at creation
zink: add an input attachment to the gfx push set layout to handle fbfetch
zink: fix lazy descriptor deinit
zink: add an input attachment to the gfx push set layout to handle fbfetch
zink: update push descriptor set anytime fbfetch changes
zink: add a renderpass flag for input attachment layout handling
zink: enable fbfetch pipe cap
docs: mark off ES 3.2 for zink
zink: ci updates
zink: destroy shader modules on program free to avoid leaking
aux/cso: always restore states in atom order
gallium/cso: add unbind mask for cso restore
zink: directly pass resource pointer to descriptor state updates
zink: use tc rebind info for buffer replacements
zink: split out stalling from fence-waiting function
zink: remove refcounting from batch states
zink: ensure gfx shader module states are updated when doing a partial recalc
zink: create inner scanout object without scanout binds
zink: dynamic vertex input template
zink: don’t use dynamic vertex stride with dynamic vertex input
zink: incrementally hash gfx shader stages
zink: incrementally hash module variants in pipeline
zink: incrementally hash vertex state into pipeline hash
zink: incrementally hash all pipeline component hashes
zink: inline gfx pipeline hash table
zink: track compatible render passes
zink: use compatible renderpass state in pipeline hash
zink: clamp lazy pools to 500 descriptors and allocate more slowly
zink: remove ZINK_HEAP_HOST_VISIBLE_ANY
mesa/st: create new surfaces before destroying old ones when updating attachments
radv: just use UINT64_MAX when getting absolute timeout for that value
radv: add some asserts for descriptor updating
lavapipe: support EXT_primitive_topology_list_restart
docs: update features for lavapipe
lavapipe: unbreak imageless framebuffer
zink: move get_framebuffer() to zink_framebuffer.c
zink: store some image creation metadata to object struct
zink: store some surface metadata to struct during creation
zink: use imageless framebuffers
lavapipe: unbreak push descriptor templates
zink: add a piglit ci job for lazy descriptors
tgsi_to_nir: force int type for LAYER output
zink: hash blend state pointers on creation
zink: remove tcs shader keys
zink: move sample part of fs key to renderpass
zink: add pipeline state flag for determining if output type is points
zink: move point sprite rasterizer bits to unhashed pipeline state
zink: move drawid_broken to unhashed pipeline state
zink: always emit sample id 0 for non-msaa texel pointers in ntv
zink: fix PIPE_CAP_DRAW_PARAMETERS export
zink: add 8bit alu handling
zink: hook up 8/16bit storage exts
zink: lower 32_2x16_split pack/unpack instructions
zink: implement nir_op_pack_half_2x16_split
zink: handle 8/16bit ssbo storage
zink: handle bo struct types that are just a runtime array
zink: fix PIPE_SHADER_CAP_FP16_DERIVATIVES handling
zink: clamp query results to 500 per qbo on 32bit
util/primconvert: force restart rewrites if original primtype wasn’t supported
lavapipe: fix primitive restart with indexed indirect draws
zink: hook up VK_EXT_primitive_topology_list_restart
zink: use EXT_primitive_topology_list_restart where available
zink: use dispatch table for (almost) all vulkan calls
zink: fix some pipe caps for max instructions
mesa/st: use uint for instance_divisor instead of int
aux/trace: dump more pipe_vertex_element members
mesa: skip fallback draw call if no primitives are being drawn
aux/trace: use private refcounts for samplerviews
zink: reorganize cached descriptor updating a bit
zink: split out lazy set updating
zink: fall back to lazy descriptors if too many cache misses in a row
zink: add “nofallback” descriptor mode
zink: document ZINK_DESCRIPTORS env var
zink: ci updates
zink: move resource unrefs to flush thread
zink: remove batch params from renderpass functions
zink: remove batch params from resource copy functions
zink: remove unused barrier function
zink: remove batch params from barrier functions
zink: clamp instance divisors to max value
zink: add 8/16bit ubo handling
zink: export PIPE_SHADER_CAP_FP16_CONST_BUFFERS
zink: initialize zink_descriptor_layout_key::use_count on create
Revert “zink: ci updates”
zink: set vbo resource usage on bind
zink: add inline for checking whether a resource has any binds
zink: replace a couple checks for bind counts with new inline
zink: add some asserts for buffer replacement
zink: add a batch ref when replacing a buffer that has binds and usage
zink: move batch ref when possible during buffer replacement
zink: make a local screen var for buffer replace
zink: use better check for determining bufferview rebinds
zink: remove ZINK_RESOURCE_USAGE_STREAMOUT
zink: use bind_stages for pipeline barrier generation
zink: don’t generate more pipeline stages if vertex bit is already set
zink: use more accurate generation for buffer barrier pipeline stages
zink: remove bind_stages and bind_history from zink_resource
zink: remove zink_get_resource_for_descriptor()
zink: use descriptor info for ubo hashing
zink: fix ZINK_MAX_DESCRIPTORS_PER_TYPE to stop exploding the stack
zink: add function for decomposing vertex format to single component
zink: decompose vertex attribs into single components when not supported
zink: use smallest int type possible for decompose shader key
zink: hook up dmabuf ext
zink: add dmabuf modifier query hooks for screen
zink: hook up VK_EXT_queue_family_foreign
zink: split import and export fd handle types
zink: set a flag for dmabuf init
zink: handle image creation for dmabufs
zink: fix import pNext attachment during image creation
zink: use foreign queue import for dmabufs
zink: add dmabuf fd handling
zink: fix dmabuf cap export
zink: unconditionally support conditional rendering
zink: fix some return values
zink: add return values for resource usage unsetting
zink: move barrier info to resource object struct
zink: unset barrier info if resource object no longer has usage after reset
zink: unset src access in barriers if there’s no src pipeline stages
zink: assert surface geometry
zink: add a resource reference for bufferviews
zink: move surface and bufferview caches onto resources
zink: wrap framebuffer surfaces to preserve gallium expectations
zink: be smarter about fb surface rebinds
zink: force imageless fb rebind if rebinding an attachment
zink: update surface info when rebinding to storage
zink: add some debug asserts to validate imageless framebuffer correctness
compiler/spirv: add a fail if tex instr coord components aren’t dimensional enough
zink: don’t copy inner surface refcount
zink: stop setting nr_samples for null surfaces
zink: fix enabled vertex buffer mask calculation
zink: move pending prim type to gfx pipeline struct
zink: make tcs shader generation take screen param
zink: remove ctx references from shader compile path
zink: remove some ctx references from shader/pipeline compile
zink: only update gfx pipeline cache after creating a real pipeline
zink: simplify flagging last vertex stage for updating
zink: move xfb updates to just before draw
zink: move shader keys to be persistent on pipeline state
zink: move uniform size calc for shader keys into keybox
zink: store shader key to shader module
zink: stop using hash table for compute programs
zink: move shader cache to gfx program struct
zink: replace shader module hash table with a list
zink: remove default_variants storage in program struct
zink: split out inlined uniform shader variants into separate cache
zink: simplify shader variant update loop
zink: cap max shader variants with inlined uniforms
zink: store drm fd to screen
zink: unbreak dmabuf handling
zink: pre-filter multi-plane modifiers
zink: pass all modifiers through to image creation
zink: zero VkImageCreateInfo::queueFamilyIndexCount on creation
features: fix listing for GL_ARB_parallel_shader_compile
util/tc: rename tc_replace_buffer_storage_func::num_rebinds and document
zink: don’t leak drm fd on drmPrimeFDToHandle failure
zink: disable miplevel tests in ci completely for now
zink: fix regex syntax from previous ci commit
build: fix nine compilation with only zink enabled as a gallium driver
zink: always use type size for query result copy stride
zink: fix ci skips
zink: don’t use legacy scanout with modifiers
zink: clean up texture_barrier hook a little
zink: check for pending memory barrier before trying to flush it
zink: enable timeline ext features
zink: split vk debug logging into separate functions
zink: repack zink_render_pass_state
zink: add ZINK_HEAP_DEVICE_LOCAL_LAZY
zink: add ZINK_BIND_TRANSIENT
zink: improve handling of buffer rebinds using tc info
zink: reorder draw state updates
zink: remove fbfetch layout thingy from zs renderpass init
zink: move fb attachment init to new function
zink: stop setting nr_samples for shader image surface creation
zink: implement GL_EXT_multisampled_render_to_texture
docs: mark off GL_EXT_multisampled_render_to_texture for zink
zink: remove duplicated struct member set
zink: force lazy descriptor set rebinds if pipeline compatibility changes
zink: split out bvci creation from object creation
zink: don’t add resource to pending barrier set if no barrier will be generated
zink: refactor some shader image code to make it reusable
zink: handle bindless images and samplers in ntv
zink: hook up VK_EXT_descriptor_indexing
zink: implement bindless textures
zink: export PIPE_CAP_BINDLESS_TEXTURE
features: mark off bindless texture for zink
lavapipe: add support for KHR_shader_float_controls
anv: assert that legacy_scanout isn’t used with explicit modifiers
wsi/x11: fix uninit value by using zalloc for swapchain
zink: make a local resource var in fb_clears_apply_internal
zink: break out surface info init to helper function
anv: support EXT_primitive_topology_list_restart
zink: stop using VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT
zink: ensure fences are released before reusing them
zink: support 16bit rgbx formats
ci: updates
lavapipe: inherit from vk_image
lavapipe: EXT_4444_formats support
lavapipe: remove display extension support
build: unify vulkan cpp platform args
build: also remove wayland wsi flags from c++ build
features: be explicit about EXT_color_buffer_half_float support
zink: wait on thread queue before destroying context
zink: split out fb state updating to helper function
zink: wait in the flush thread when ETOOMANY batches are out
zink: move semaphore reset handling to submit
zink: remove zink_context::curr_batch
zink: stop leaking buffers on replacement
zink: switch remaining direct access of zink_resource_object::(reads|writes) to util
zink: remove reads/writes members from zink_resource_object
zink: stop leaking resource surface cache hash tables
zink: rework in-use batch states hash table to be a singly-linked list
zink: ci updates
zink: move glx@glx-multi-window-single-context to flakes
radv: don’t use invalid stride for triggering vertex state change
radv: dynamically calculate misaligned_mask for dynamic vertex input
radv: pre-calc “simple” dynamic vertex input values
radv: add a mask of bound descriptor buffers for dynamic vertex input
radv: move alpha_adjust into conditional during vertex input updating
aux/pb: add a tolerance for reclaim failure
aux/pb: more correctly check number of reclaims
zink: use static array for detecting VK_TIME_DOMAIN_DEVICE_EXT
zink: add a read barrier for indirect dispatch
zink: fully zero surface creation struct
zink: rescue surfaces/bufferviews for cache hits during deletion
zink: clear descriptor refs on buffer replacement
zink: assert compute descriptor key is valid before hashing it
zink: don’t update lazy descriptor states in hybrid mode
zink: move push descriptor updating into lazy-only codepath
zink: add an early return for zink_descriptors_update_lazy_masked()
zink: move last of lazy descriptor state updating back to lazy-only code
zink: detect prim type more accurately for tess/gs lines
zink: don’t break early when applying fb clears
zink: only reset zink_resource::so_valid on buffer rebind
zink: don’t check rebind count outside of buffer/image rebind function
zink: stop exporting PIPE_SHADER_CAP_FP16_DERIVATIVES
zink: don’t add dynamic vertex pipeline states if no attribs are used
zink: fix gl_SampleMaskIn spirv generation
zink: more accurately update samplemask for fs shader keys
nir/lower_samplers_as_deref: rewrite more image intrinsics
zink: add better handling for CUBE_COMPATIBLE bit
zink: use align64 for allocation sizes
zink: set aspectMask for renderpass2 VkAttachmentReference2 structs
zink: always use explicit lod for texture() when legal in non-fragment stages
zink: be more permissive for injecting LOD into texture() instructions
zink: inject LOD for sampler version of OpImageQuerySize
zink: flag renderpass change when toggling fbfetch
zink: don’t clamp cube array surfaces to cubes
zink: don’t clamp 2D_ARRAY surfaces to 2D
zink: error when trying to allocate a bo larger than heap size
zink: clamp max buffer sizes to smallest buffer heap size
zink: explicitly enable VK_EXT_shader_subgroup_ballot
zink: add more int/float types to cast switching in ntv
zink: force float dest types on some alu results
zink: stop double printing validation messages
zink: add SpvCapabilityStorageImageMultisample for multisampled storage images
zink: reject all storage multisampling if the feature is unsupported
zink: add queue locking
build: add sha1_h to llvmpipe build
zink: set fbfetch state on lazy batch data when enabling it
zink: always use lazy (non-push) updating for fbfetch descriptors
zink: clamp PIPE_SHADER_CAP_MAX_INPUTS for xfb
aux/primconvert: handle singular incomplete restarts
zink: rework cached fbfetch descriptor fallback
aux/trace: fix vertex state tracing
zink: be more consistent about applying module hash for gfx pipeline
zink: update gfx pipeline shader module pointer even if the program is unchanged
zink: always add VK_IMAGE_CREATE_2D_ARRAY_COMPATIBLE_BIT for 3D images
Mykhailo Skorokhodov (3):
iris: Fix compute shader leak
iris: Add missed tile flush flag
Revert “iris: add tile cache flush to iris_copy_region”
Nanley Chery (41):
anv: Add genX(cmd_buffer_emit_gfx12_depth_wa)
iris: Add genX(emit_depth_state_workarounds)
iris: Update the clear value in cso_z->packets
iris: Emit clear_params as part of cso_z->packets
iris: Update clear_params only when HiZ is enabled
intel: Move the D16 workarounds out of ISL
iris: Use constants for emitting cso_z->packets
iris: Optimize genX(emit_depth_state_workarounds)
anv: Optimize genX(cmd_buffer_emit_gfx12_depth_wa)
intel: Use env_var_as_boolean for INTEL_NO_HW
intel: Parse INTEL_NO_HW for devinfo construction
intel/isl: Add msaa_layout param to isl_tiling_get_info
intel/isl: Define ISL_TILING_4/64 for XeHP
intel/isl: Update image alignments on XeHP
intel/isl: Size Tile64 surfaces with 4 dimensions
intel/isl: Drop extra assert on array_pitch_el_rows
intel/isl: Drop ISL_SURF_USAGE_DISPLAY_*_BIT
intel/isl: Use an allow-list in gfx6_filter_tiling
intel/isl: Update tiling filter functions for XeHP
intel: Support Tile4/64 in depth/stencil state
intel: Support Tile4/64 in surface states
intel/blorp: Fix faked RGB image alignment on XeHP
intel/blorp: Fix Gfx7 stencil surface state valign
intel/isl: Fix halign/valign of uncompressed views
intel/isl: Use a switch for HALIGN/VALIGN encoding
intel: Update surface states for XeHP alignments
intel: Add underscores to HALIGN and VALIGN enums
intel/isl: Disable I915_FORMAT_MOD_Y_TILED on XeHP+
iris: Disable tiled memcpy for Tile4
anv/image: Don’t assert that HiZ can be added
iris: Delete iris_resource_get_clear_color
iris: Support NULL aux BOs in fill_surface_state
iris: Split clear color and aux BO checks
iris: Simplify an iris_use_pinned_bo call
iris: Allow NULL aux BOs in aux-state functions
iris: Don’t add a clear color BO for MC_CCS
iris: Add and use get_num_planes
iris: Finish aux import in iris_resource_from_handle
anv: Allow HIZ_CCS_WT with subpass self-dependencies
anv: Tile cache flush for depth before fast clear
iris: Tile cache flush for depth before fast clear
Neha Bhende (4):
aux/draw: use nir_to_tgsi for draw shader in llvm path
svga/drm: use pb_usage_flags instead of pipe_map_flags in vmw_svga_winsys_buffer_map
auxiliary/indices: convert primitive type PIPE_PRIM_PATCHES
st: Fix 64-bit vertex attrib index for TGSI path
Neil Roberts (1):
v3d: Update prim_counts when prims generated query in flight without TF
Olivier Fourdan (1):
radeonsi: Check aux_context on si_destroy_screen()
Paulo Zanoni (10):
iris: mark the workaround_bo as asynchronous
iris: don’t bump the seqno for the workaround_bo
iris: assign bo->index to the aux map BOs too
iris: extract the code that adds BOs to the batch lists
iris: add the workaround_bo directly to the batch
iris: use add_bo_to_batch() when adding batch->bo
iris: syncobjs are now owned by bufmgr instead of screen
iris: give each screen of a bufmgr a unique ID
iris: switch to explicit busy tracking
iris: signal the syncobj after a failed batch
Pavel Asyutchenko (3):
vulkan/overlay: Fix violation of VUID-VkMappedMemoryRange-size-01389
llvmpipe: fix crash when doing FB fetch + gl_FragDepth write in one shader
lavapipe: Fix vkWaitForFences for initially-signalled fences
Philipp Zabel (3):
etnaviv: fix gbm_bo_get_handle_for_plane for multiplanar images
etnaviv: fix dirty bit check for baselod emission
etnaviv: add mov for direct depth store output from load input
Pierre Moreau (5):
clover: Do not advertise OpenCL x.y when unsupported
clover/spirv: Increase max amount of function args
clover/spirv: Properly size 3-component vector args
clover/api: Interleave details in dispatch table
clover/nir: Set constant buffer pointer size to host
Pierre-Eric Pelloux-Prayer (78):
mesa: fix bindless uniform samplers update
dlist: don’t handle unmerged draws as merged
mesa: move gl_program::is_arb_asm to shader_info
radeonsi: preserve derivatives after discards for ARB shaders
gallium/va: don’t use key=NULL in hash tables
amd/registers: fix fields conflict detection
dlist: upload vertices in compile_vertex_list
dlist: implement vertices deduplication
radeonsi: add a script to run piglit/glcts/deqp tests
radeonsi: add expected tests results for Navi10 GPU
st/pbo: only use x coord when reading a PIPE_TEXTURE_1D
st/pbo: set nir_tex_instr::is_array field
st/pbo: add a fast pbo download code-path
radeonsi: fix test script’s output
radeonsi: add -t option to the test script
radeonsi: don’t create an infinite number of variants
nir: add a pass to optimize “gl_FragDepth = gl_FragCoord.z” away
radeonsi/test: fix test script args handling
radeonsi/test: format radeonsi-run-test.py with black
radeonsi/test: allow to pass a filename as a test filter value
radeonsi/test: prettier output
radeonsi/test: add Sienna Cichlid expected results
vbo/dlist: simplify add_vertex function
vbo/dlist: apply start_offset after indices construction
vbo/dlist: move VAO update at the end
vbo/dlist: use buffer_in_ram_size
vbo/dlist: use a single buffer object
vbo/dlist: remove vbo_save_vertex_store::bufferobj
vbo/dlist: don’t store prim_store
vbo/dlist: use prim_store directly
vbo/dlist: realloc prims array instead of free/malloc
vbo/dlist: don’t force list compilation if out of prim space
vbo/dlist: remove vbo_save_context::buffer_ptr
vbo/dlist: reset vertex_store::used in reset_counters
vbo/dlist: remove vbo_save_context::buffer_map
vbo/dlist: realloc vertex stores
vbo/dlist: remove vbo_save_context::max_vert
vbo/dlist: limit allocation sizes
vbo/dlist: don’t force list compilation if out of vertex space
vbo/dlist: rework out of memory
vbo/dlist: fix max_index_count value
vbo/dlist: remove vbo_save_copied_vtx
vbo/dlist: remove vbo_save_context::vert_count
vbo/dlist: add documentation
vbo/dlist: remove unused functions
vbo/dlist: rework buffer sizes
vbo/dlist: rework primitive store handling
vbo/dlist: rework vertex_store management
vbo/dlist: fix indentation in vbo_save_api.c
vbo/dlist: reallocate the vertex buffer on vertex upgrade
Revert “ci/v3d: add piglit flake”
radeonsi/test: fix typo in the test script
radeonsi/test: update expected results
radeonsi/sqtt: export wave size and scratch size
radeonsi/sqtt: add si_se_is_disabled
radeonsi/test: don’t require a folder name
radeonsi/test: use -t for deqp tests
radeonsi/test: print default values in help
radeonsi/test: allow to specify a baseline folder
radeonsi/test: sanitize output_folder
radeonsi/test: add --gpu to select the GPU to test
radeonsi/test: add Raven expected results
radeonsi/test: add sanity checks
gallium: add PIPE_CAP_PREFER_BACK_BUFFER_REUSE
loader/dri3: avoid reusing the same back buffer with DRI_PRIME
radeonsi: disable PIPE_CAP_PREFER_BACK_BUFFER_REUSE
radeonsi: don’t clear G_028644_OFFSET
radeonsi: implement si_sdma_copy_image for gfx7+
radeonsi: add an async compute context
gallium: add a is_dri_blit_image bool to pipe_blit_info
radeonsi: make the DRI_PRIME dGPU -> iGPU copy async
radeonsi: use viewport offset in quant_mode determination
radeonsi: treat nir_intrinsic_load_constant as a VMEM operation
radeonsi/sdma: fix bogus assert
ac/surface: don’t validate DCC settings if DCC isn’t possible
vbo/dlist: free copied.buffer if no vertices were copied
mesa: always call _mesa_update_pixel
radeonsi/sqtt: fix shader stage values
Qiang Yu (20):
nir/inline_uniforms: add uniforms in condition atomically
nir/inline_uniforms: support vector uniform
nir/loop_analyze: move nir_is_supported_terminator_condition() to header
nir/loop_analyze: record induction variables for each loop
nir/loop_analyze: skip unsupported induction variable early
nir/inline_uniforms: support loop
egl/dri2: separate EGLImage validate and lookup
gbm/dri: implement image lookup extension version 2
gallium/dri: add dri_screen egl image validate hooks
gallium/api: add validate_egl_image interface
mesa: add ValidateEGLImage driver callback
mesa: fix glthread deadlock when EGL multi thread shared context
nir/lower_io_to_vector: check centroid & sample when merge variable
nir/linker: pack varyings with different interpolation qualifier
radeonsi: enable nir option pack_varying_options
radeonsi: fix ps SI_PARAM_LINE_STIPPLE_TEX arg
loader/dri3: fix swap out of order when changing swap interval
mesa/st: delay nir spirv link
nir/linker: support uniform when optimizing varying
nir/linker: rename replace_constant_input to replace_varying_input_by_constant_load
Quantum (1):
main: allow all external textures for BindImageTexture
Rhys Perry (108):
aco: don’t create v_madmk_f32/v_madak_f32 from v_fma_legacy_f16
ac/llvm: implement v2f16 fsat
radv: set image_dim and image_array intrinsic indices
aco: use image_dim and image_array intrinsic indices
aco: calculate correct register demand for branch instructions
nir/algebraic: fix imod by negative power-of-two
nir/algebraic: don’t optimize umod/imod/irem if lower_bitops=true
nir/algebraic: add optimizations for imul(a, INT_MIN)
nir/search: don’t consider INT_MIN a negative power-of-two
nir/algebraic: improve irem by power-of-two optimization
nir/idiv_const: improve idiv(n, INT_MIN)
nir/idiv_const: optimize imod/irem
nir: fix signed overflow for iadd constant folding
nir/tests: add tests for umod/imod/irem optimizations
radv: enable DCC with signedness reinterpretation
nir: remove src/compiler/nir/nir_control_flow
nir: swap fadd operands in nir_atan()
spirv: swap fadd operands in build_asin() and matrix_multiply()
nir/algebraic: add various ffma optimizations
nir/algebraic: reassociate add chains for more MAD/FMA-friendly code
nir/algebraic: add is_used_once to dot product reassociation optimization
nir: add ffma creation helpers
nir: create ffma from builders more often
nir: lower fdot to ffma if lower_ffma=false
spirv: create ffma more often
nir,glsl_to_nir: use nir_fdot()
ci: update trace hashes
aco: fix validation of DPP v_cndmask_b32/v_addc_co_u32
aco: add can_use_DPP() and convert_to_DPP()
aco: move a bunch of helpers into aco_ir.h/aco_ir.cpp
aco: make optimize_postRA() work across blocks
aco: handle DPP in the optimizer
aco: combine DPP into VALU before RA
aco: combine DPP into VALU after RA
aco/tests: add tests for pre-RA DPP combining
aco/tests: add tests for post-RA DPP combining
aco: fix vectorized 16-bit load_input/load_interpolated_input
aco: remove label_extract if the extract is used by a non-VALU
aco/scheduler: allow moving down VMEM stores to below VMEM loads
nir/lower_io: use nir_vector_insert_imm()
radv: use nir_vector_insert_imm in lower_intrinsics
nir: consider push constant loads as always dynamically uniform
nir/gcm: pin some instructions which require uniform sources
aco: include utility in isel
aco: don’t constant propagate to DPP instructions
aco/tests: test copy propagation with DPP instructions
aco: remove DPP when applying constants/literals/sgprs
aco: don’t coalesce constant copies into non-power-of-two sizes
aco/spill: add temporary operands of exec phis to next_use_distances_end
nir: separate lower_add_sat
nir: add sdot_2x16 and udot_2x16 opcodes
spirv: use sdot_2x16 and udot_2x16 opcodes
ac/gpu_info: add has_accelerated_dot_product
ac/llvm: implement nir_op_pack_32_4x8
ac/llvm,radv: implement uadd_sat/iadd_sat
ac/llvm: implement udot_4x8/sdot_4x8/udot_2x16/sdot_2x16 opcodes
radv: refactor handling of nir_options
radv,aco: implement iadd_sat
aco: implement nir_op_pack_32_4x8
aco: implement udot_4x8/sdot_4x8/udot_2x16/sdot_2x16 opcodes
aco/ra: allow v1b operands with 16-bit instructions
radv: expose VK_KHR_shader_integer_dot_product
aco/ra: don’t use ds_write_b8_d16_hi/ds_write_b16_d16_hi on GFX8
nir: fix serialization of loop/if control
radv: fix pipeline caching with robust buffer access
aco: add RegClass::is_linear_vgpr helper
aco: add and use RegClass::resize helper
aco: rewrite print_reg_class()
aco: find a scratch register for sub-dword copies on GFX7 if scc is empty
aco: find scratch reg for sub-dword pseudo instructions which read sgprs
aco/tests: fix finish_ra_test()
aco/tests: add regalloc.scratch_sgpr.create_vector
aco: implement linear vgpr copies
aco: allow live-range splits of linear vgprs in top-level blocks
aco/nops: use up-to-date mask_size
aco/nops: create handle_raw_hazard_instr helper
aco/nops: add State
aco/nops: fix handle_raw_hazard_internal when visiting the current block
nir/algebraic: distribute fmul(fadd(a, b), c) when b and c are constants
aco/tests: add idep_amdgfxregs_h
nir: add nir_src_components_read()
nir/opt_if: add opt_if_rewrite_uniform_uses
radv: don’t require a GS copy shader to use the cache with NGG VS+GS
radv: workaround incorrect image format with World War Z
radv: move ngg culling determination earlier
nir: add _amd suffix to fragment_mask_fetch and fragment_fetch texops
nir/lower_tex: add lower_to_fragment_fetch_amd
radv: don’t create blit pipelines for multisampled 3D images
aco: return 0x76543210 for NULL FMASK fetch
ac/nir: return 0x76543210 for NULL FMASK fetch
aco: use correct dim for FMASK fetches
radv,aco: use lower_to_fragment_fetch
radv,aco: don’t include FMASK in the storage descriptor
ac/llvm: fix image_samples with null descriptors
radv/llvm: fix parameter index for layer exports
aco: fix vadd32() when b is neither a constant nor temporary
radv: add and use radv_vs_input_alpha_adjust
radv: add radv_translate_vertex_format()
radv: add radv_shader_variant_get_va and radv_find_shader_variant helpers
radv: add segregated fit shader memory allocator
radv: move VS specific input SGPRs first
radv: implement dynamic vertex input state using vertex shader prologs
radv: add pre-compiled vertex shader prologs for common states
aco: implement aco_compile_vs_prolog
aco: implement VS input loads with prologs
radv: implement VK_EXT_vertex_input_dynamic_state
radv: enable VK_EXT_vertex_input_dynamic_state
aco: consider pseudo-instructions reading exec in needs_exec_mask()
Rob Clark (81):
freedreno/registers: update dsi registers to support tpg
freedreno/a6xx: Add missing PC_CCU_INVALIDATE_x
driconfig: Add support for device specific config
driconf: Add force_gl_renderer override
freedreno: Support per-device driconf overrides
freedreno: Unleash the dragon!
freedreno: Move generated device table to .h
freedreno: Drop device_id
freedreno: Reduce use of screen->gpu_id
freedreno/ir3: Reduce use of compiler->gpu_id
freedreno/ir3/lower_io_offsets: Drop gpu_id param
freedreno/all: Introduce fd_dev_id
freedreno: Make chip_id 64b
freedreno: Device matching based on chip_id
freedreno: Use correct key for binning pass shader
freedreno: Add a680 support
freedreno/cffdec: Fix indentation
freedreno/cffdec: Fix gpuaddr comparison
freedreno/crashdec: Decode full RB in verbose mode
freedreno/crashdec: Quiet spammy print in query mode
freedreno/common: Fix comment typo
freedreno/a6xx: Set type for PC_HS_INPUT_SIZE
freedreno/a6xx: Register updates for a6xx gen3
freedreno/a6xx: Rast updates for a6xx gen3
freedreno/a6xx: Fix streamout with tess_use_shared
freedreno/a6xx: Updates for tess_use_shared
freedreno/a6xx: Register updates for a6xx gen4
freedreno/a6xx: Fix a6xx gen4 compute shaders
freedreno/ci: Add a status variable for CI farm
freedreno/ci: Take fd farm offline for moving day
freedreno/ci: Bring fd farm back online after move
clover: Don’t remove sampler/image uniforms
nir/lower_amul: Handle load/store_global
nir/lower_amul: Fix usage of nir_foreach_src()
freedreno/ir3: Update physical_successors after retargetting jumps
freedreno/ir3: Fix physical successors for break out of loop
freedreno/ir3: Fix double printing of branch suffix
freedreno/ir3: Validate physical successors
freedreno/ir3: Improve error msg for block level validation
freedreno/ir3: Update physical_predecessors for streamout block
freedreno: Remove unused function
freedreno: Cleanup primtypes/primtypes_mask
freedreno: Move a6xx specific screen init
freedreno/drm: Garbage collect unused bo_cache
freedreno/drm: Rename bo->flags to bo->reloc_flags
freedreno/drm: Consider allocation flags in bo-cache
freedreno/drm: Don’t return shared/control bo’s to cache
freedreno/drm: Add cached-coherent bo support
freedreno/drm: Use cached-coherent cmdstream buffers
freedreno/drm: Use cached-coherent for control bo
freedreno: Used cached coherent for staging resources
freedreno: Add perf warning for WC readback
freedreno/a6xx: Pre-bake SO-disable stateobj
freedreno/ir3: Fix sched debug msgs
freedreno/ir3: Cleanup liveness lifetime
freedreno/ir3: Fix generation check
freedreno/computerator/a4xx: Fix enum mismatch warning
freedreno: Add info->a6xx.has_shading_rate
turnip: Fix uninitialized cs->device
turnip: Rast updates for a6xx gen4
turnip: Fix a6xx gen4 compute shaders
isaspec: Remove unused leftovers
isaspec: Fix comment
isaspec: Split encode_bitset() into its own template
isaspec: De-duplicate bitset encoding
freedreno: Get shader variant msgs in perf debug output
freedreno: Optimize no-op submits
freedreno: Fix some indentation
freedreno/ir3: Remove used unused
freedreno: Handle cso==NULL in bind_sampler_states
freedreno: Handle PIPE_FORMAT_NONE buffers
gallium/u_threaded: Get reset status without sync
freedreno: Disable TC syncs for get_device_reset_status()
zink: Disable TC syncs for get_device_reset_status()
Revert “freedreno: Fix autotune regression since batch-cache rework.”
Revert “freedreno: Remove dead fd_batch_reset().”
Revert “freedreno: Use a BO bitset for faster checks for resource referenced.”
Revert “freedreno: Remove the submit lock locking.”
Revert “freedreno: Move the batch cache to the context.”
gallium/u_threaded: Split out options struct
freedreno/drm: Move pipe unref after fence removal
Rohan Garg (7):
virgl: Add more meta data to cached resources
Revert “Revert “virgl: Cache depth and stencil buffers””
virgl: Enable caching for sampler views and render targets
i965: Take into account the offset when marking a valid data region
i965: Write a custom allocator for the intel memobj struct
ci: Fix a minor issue in prepare-artifacts.sh script
ci: Use FDO_DISTRIBUTION_TAG where possible
Roland Scheidegger (7):
llvmpipe/linear: don’t try to use tgsi analysis for nir shaders
llvmpipe: always use draw_regions intersection
llvmpipe: fix nir dot products (fsum op)
aux/cso: try harder to keep cso state in sync on cso context unbind
gallium: add rasterizer depth_clamp enable bit
lavapipe: implement VK_EXT_depth_clip_enable
lavapipe: Fix crashes with transform feedback when using VK_WHOLE_SIZE
Roman Stratiienko (7):
kmsro: Add ‘kirin’ driver support
AOSP: Extract version from libdrm instead of hardcoding it.
AOSP: Upgrade libLLVM dependency to v12
AOSP: Update timestamps of target binaries
AOSP: Add panfrost vulkan library suffix
lima: Implement lima_resource_get_param() callback
meson_options: Bump max value of platform-sdk-version to 31
Ryan Neph (1):
virgl: disallow null-terminated debug messages
Sagar Ghuge (19):
nir: Add new opcode for ternary addition
intel/compiler: Add support for ternary add instruction on XeHP
intel/compiler: Make decision based on source type instead of opcode
intel/compiler: Allow ternary add to promote source to immediate
nir: Add optimizations for iadd3
intel/compiler: Enable has_iadd3 option on XeHP
intel/compiler: Fix missing break in switch
intel/compiler: Handle ternary add in lower_simd_width
genxml/gen12: Update debug register fields according to HW
genxml/gen125: Update debug register fields according to HW
anv: Fix VK_EXT_memory_budget to consider VRAM if available
intel/compiler: Add 64-bit A64 float logical opcode support
anv: Advertise support for shaderBufferFloat64AtomicMinMax
intel/compiler: Add support to handle 64-bit atomics with A32 messages
anv: No need to lower to A64 messages for 64-bit atomics
iris: Enable atomic operations on compressed surfaces
intel/genxml: Add new bit fields Render Compression Format
isl: Add helper to return render compression format encoding
isl: Use software programmable render compression format encoding
Samuel Pitoiset (215):
radv: only init the TC-compat ZRANGE metadata for the depth aspect
radv: fix bounds checking for zero vertex stride on GFX6-7
radv: report APUs as discrete GPUs for Red Dead Redemption 2
radv: fix specifying the stencil layout for separate depth/stencil layouts
radv: allow unused VkSpecializationMapEntries
aco: implement VK_EXT_shader_atomic_float2
radv: implement VK_EXT_shader_atomic_float2
radv: reduce number of emitted DWORDS for contiguous context registers
radv: do not use radeon_set_context_reg_seq() for only one register
radv: init radv_image::l2_coherent when creating the layout
ac: introduce a structure to store DCC address equations for GFX9
amd/addrlib: expose CMASK address equations to drivers on GFX9
ac/surface: add tests for CmaskAddrFromCoord prototype outside of addrlib
ac/surface: store CMASK pitch and height to radeon_surf
ac/surface: copy the CMASK equation to radeon_surf
ac/surface: implement CmaskAddrFromCoord in NIR
radv: fix selecting the first active CU when profiling with SQTT
radv: fix missing cache flushes when clearing HTILE levels on GFX10+
amd/addrlib: expose CMASK address equations to drivers on GFX10+
ac/surface: add tests for CmaskAddrFromCoord on GFX10+
ac/surface: implement CmaskAddrFromCoord in NIR on GFX10+
radv: rework DCC, FMASK and FCE decompress path
radv: perform a FCE for MSAA images that might have been fast-cleared
radv: allow DCC MSAA fast clears if a FCE is needed
radv: fix initializing the DS clear metadata value for separate aspects
radv: remove unnecessary FIXME about custom sample locations
radv: flush caches before performing separate depth/stencil aspect init
radv: bump maxFragmentSizeAspectRatio to 2
radv: disable fragmentShadingRateWithCustomSampleLocations
radv: bump maxFragmentShadingRateCoverageSamples to 32
radv: fix reported sample counts for VRS 1x1
radv: use more explicit DCC clear codes
radv: pass an image view to vi_get_fast_clear_parameters()
radv: add RADV_DCC_CLEAR_SINGLE
radv: determine if an image supports fast clears using comp-to-single
radv: implement DCC fast clears with comp-to-single
radv: skip FCE for images that are fast-cleared using comp-to-single
radv: enable DCC fast-clears with comp-to-single on GFX10+
radv: allow fast clears for concurrent images if comp-to-single is supported
radv: fix pre-computing viewport xform when setting new viewports
radv: fix fast clearing depth images with mips on GFX10+
radv: determine if an image supports comp-to-single at creation time
radv: remove useless check about the FCE predicate offset
radv: do not allocate the FCE predicate for images that use comp-to-single
radv: remove unnecessary check in radv_layout_is_htile_compressed()
radv: remove incorrect comment about compressed writes to HTILE on GFX10+
radv: fix copying depth+stencil images on compute
radv: remove unused fast depth-stencil gfx clear path with expclear
radv: remove useless DISABLE_{ZMASK,SMEM}_EXPCLEAR_OPTIMIZATION state
radv: don’t use SQ_NON_EVENT before GE_PC_ALLOC for better perf on Navi1x
radv: allocate shaders to 32-bit address to skip PGM_HI
nir/opt_algebraic: optimize fmax(-fmin(b, a), b) -> fmax(b, -a)
Revert “nir/opt_algebraic: optimize fmax(-fmin(b, a), b) -> fmax(b, -a)”
nir/opt_algebraic: optimize fmax(-fmin(b, a), b) -> fmax(fabs(b), -a)
ci: update the list of expected failures/skips for RADV
radv: allow storage images with VK_FORMAT_E5B9G9R9_UFLOAT_PACK32 on GFX10.3+
ci: update the list of skipped tests for Fiji/RADV
radv: remove outdated radv_finishme() in the HW resolve path
radv: remove useless check about number of samples in the HW resolve path
radv: remove unnecessary radv_finishme() for invalid color formats
radv: disable DCC image stores on Navi12-14 for displayable DCC corruption
radv: do not load/store the clear value for comp-to-single images
radv: do not allocate a clear value for images that support comp-to-single
radv: add support for clearing multi layers with normal gfx clear path
vulkan: Update the XML and headers to 1.2.190
radv: advertise VK_EXT_primitive_topology_list_restart
ac/llvm: adjust assertion for nir_intrinsic_terminate
ac/llvm: fix huge alignment when loading from shared memory
radv/llvm: fix invalid IR when converting triangle strips to indices
radv: use radeon_set_sh_reg_seq() more for initial gfx/compute state
radv: call nir_lower_int64() for LLVM
radv: track if shader image 32-bit float atomics are enabled
radv: do not disable DCC for storage images if atomics aren’t enabled
vulkan: add common entrypoints for sparse image requirements/properties
radv: use common entrypoints for sparse image requirements/properties
radv: use common vkGetPhysicalDevice{Image}FormatProperties()
radv: use common vkGetDeviceQueue()
radv: use common vkBind{Buffer,Image}Memory()
radv: use common vkGet{Buffer,Image}MemoryRequirements()
radv: fix determining the maximum number of waves that can use scratch
radv: remove NGG streamout support in LLVM
radv: allow to conditionally read HTILE value when copying VRS rates
radv: optimize copying VRS rates to the global HTILE buffer
radv: pass the HTILE buffer to radv_copy_vrs_htile()
radv: optimize VRS when no depth stencil attachment is bound
radv/llvm: rework VS input loads and implement the callback
ac/llvm: fix build with LLVM 14
radv: add MSAA support to the comp-to-single fast clear path
radv: enable comp-to-single for MSAA images
radv: reduce SQTT traffic when instruction timing is disabled
radv/llvm: fix using Wave32
radv/llvm: fix vertex input fetches with 16-bit floats
ac/llvm: implement nir_intrinsic_image_deref_atomic_{fmin,fmax}
ac/llvm: implement nir_intrinsic_ssbo_atomic_{fmin,fmax}
ac/llvm: implement nir_intrinsic_shared_atomic_{fmin,fmax}
ac/llvm: implement nir_intrinsic_global_atomic_{fmin,fmax}
radv: advertise EXT_shader_atomic_float2 with LLVM 14+
radv/ci: add a list of expected failures for VanGogh
ac/rgp, radv: report scratch memory size for shaders
ac/rgp, radv: report wave size for shaders
radv: rename radv_decompress_depth_stencil()
radv: implement depth/stencil expand on compute
radv: add support for copying compressed depth/stencil images on compute
radv: keep depth/stencil images compressed for TRANSFER_DST on compute
radv: replicate THREAD_TRACE_CTRL config when stopping SQTT
radv: make the SQTT BO a resident buffer
radv: remove useless assertions in the SQTT path
radv: do not use a different disk cache key for LLVM
radv: do not store meta shaders to the default shader disk cache
radv: remove useless shader variant key copies for VS+TCS
radv: stop loading invocation ID for NGG vertex shaders
radv: remove unused radv_tcs_variant_key:primitive_mode
radv: stop using the shader keys for as_ls/as_es/as_ngg when possible
radv: remove useless as_ngg_passthrough init when lowering NGG in NIR
radv/llvm: stop using vs_common_out.as_ngg_passthrough
radv: add export_clip_dists for VS and TES to radv_shader_info
radv,aco: stop using vs_common_out.export_clip_dists
radv/llvm: stop using vs_common_out.export_prim_id
radv: store the topology instead of the output primitive type in the key
radv: store the CS subgroup size to radv_shader_info
radv: rework layout of radv_pipeline_key
radv: pass the pipeline key to the backend compilers
radv: cleanup uses of VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT
radv: remove unused radv_nir_compiler_options fields
radv: remove unnecessary vs_common_out.export_viewport_index
radv: remove unnecessary vs_common_out.export_layer_id
radv: remove unnecessary radv_shader_info:{vs,tes}.export_prim_id
radv: remove unnecessary init of outinfo.export_prim_id for GS
radv: remove vs_common_out:export_prim_id
radv: remove vs_common_out:export_clip_dists
radv: pass the pipeline key to the shader info pass
radv: use the pipeline key more when possible
radv: stop using vs_common_out.{as_es/as_ls/as_ngg*} shader keys
radv: remove radv_shader_variant_key completely
radv: fix missing features for BDA
radv: remove the LLVM stat about the number of private VGPRs
radv: fix adjusting the frag coord when RADV_FORCE_VRS is enabled
radv: fix selecting the hash when RADV_FORCE_VRS is enabled
radv: make sure to load the Primitive ID for VS+GS as NGG
radv: fix vk_object_base_init/finish for the internal pipeline cache
radv: fix vk_object_base_init/finish for internal buffer views
radv: fix vk_object_base_init/finish for the internal push descriptors
radv: fix vk_object_base_init/finish for internal image views
radv: fix vk_object_base_init/finish for internal buffers
radv: set export_clip_dists for the GS copy shader
radv: determine the VS output parameters in the shader info pass
radv: disable the DX10 diamond test for better line rasterization perf
radv: get the float controls execution mode from NIR for LLVM
radv: do not declare an extra user SGPR for sample positions and PS
radv: move ngg early prim export determination earlier
radv: move ngg lds bytes determination earlier
radv: move ngg passthrough determination earlier
radv: remove unnecessary ac_nir_ngg_config output struct
radv: constify radv_shader_info for radv_lower_{io_to_mem,ngg}()
radv: move forcing discard to demote to the graphics pipeline key
radv: move forcing invariant geometry to the graphics pipeline key
radv: move forcing MRT output NaN fixup to the graphics pipeline key
radv: move forcing VRS rates to the graphics pipeline key
radv: move use of NGG to the graphics pipeline key
radv: remove redundant check of needs_multiview_view_index for PS
radv: remove useless loads_dynamic_offsets when emitting push constants
radv: determine the ES type (VS or TES) for GS earlier
ci: enable building RADV in debian-release
radv: fix vk_object_base_init/finish for push descriptors
radv: fix writing combined image/sampler descriptor
radv: fix vk_object_base_init/finish for internal device memory objects
radv/llvm: fix exporting VS parameters
radv: do not set TRAP_PRESENT(1) for fragment shaders
aco: fix load_barycentric_at_{offset,sample}
radv: declare the shader user locs from the shader arguments
radv: determine if a shader uses indirect descriptors from the SGPR loc
radv: determine if a shader loads push constants from the SGPR loc
radv: remove unnecessary radv_shader_info:base_inline_push_consts
radv: remove unnecessary radv_shader_info:num_inline_push_consts
radv: do not overwrite loads_push_constants when declaring shader args
radv: gather more information about PS in the shader info pass
radv,aco: compute and store the SPI PS input in radv_shader_info
aco: prevent using undeclared shader arguments for PS
radv,aco: remap PS inputs when declaring shader arguments
aco: constify radv_shader_{info,args}
radv: remove radv_pipeline::layout
radv: implement vkGetDeviceBufferMemoryRequirementsKHR()
radv: implement vkGetDeviceImageMemoryRequirementsKHR()
radv: implement vkGetDeviceImageSparseMemoryRequirementsKHR()
radv: advertise VK_KHR_maintenance4
radv: use nir_image_deref_{load,store} in the DCC retile compute path
radv: remove useless coordinate computation in the compute clear path
radv: remove few useless nir_channels() in meta shaders
radv: use get_global_ids() to compute coordinates in meta shaders
radv: use nir_ssa_undef() for unused image components in meta shaders
radv: move ac_shader_config to radv_shader_binary instead of legacy
radv: store the post-processed shader binary config to the cache
radv,aco: remove nir_intrinsic_load_layer_id
radv: remove no-op about the view index in the shader info pass
radv: rename needs_multiview_view_index to uses_view_index
radv: stop gathering output GS info for vertex shaders
aco: cleanup setup_vs_output_info()
radv: do not initialize is_ngg_passthrough for geometry shaders
radv: remove duplicated code about NGG passthrough determination
radv: switch to VK_FORMAT_FEATURE_2_XXX/VkFormatProperties3KHR
radv: implement VK_KHR_format_feature_flags2
aco: do not return an empty string when disassembly is not supported
radv: fix removing PSIZ when it’s not emitted by the last VGT stage
radv: fix OpImageQuerySamples with non-zero descriptor set
radv: do not remove PSIZ for streamout shaders
aco: fix invalid IR generated for b2f64 when the dest is a VGPR
aco: fix emitting stream outputs when the first component isn’t zero
aco: fix loading 64-bit inputs with fragment shaders
radv: re-emit prolog inputs when the nontrivial divisors state changed
radv: fix build errors with Android
aco: only load streamout buffers if streamout is enabled
radv: do not expose buffer features for depth/stencil formats
radv/sqtt: fix GPU hangs when capturing from the compute queue
radv: fix a sync issue on GFX9+ by clearing the upload BO fence
nir: fix constant expression of ibitfield_extract
Sergii Melikhov (2):
iris: Fix Null pointer dereferences
dri2: Fix Null pointer dereferences
Shmerl (1):
vulkan/overlay: don’t display histogram and range for device and format
Simon Ser (18):
EGL: sync headers with Khronos
egl: add support for EGL_EXT_device_drm_render_node
etnaviv: fix renderonly check in etna_resource_alloc
etnaviv: fail in get_handle(TYPE_KMS) without a scanout resource
freedreno: fail in get_handle(TYPE_KMS) without a scanout resource
panfrost: fail in get_handle(TYPE_KMS) without a scanout resource
lima: fail in get_handle(TYPE_KMS) without a scanout resource
vulkan/wsi/wayland: use drm_fourcc.h for formats
vulkan/wsi/wayland: drop support for wl_drm
vulkan/wsi/wayland: generalize modifier handling
etnaviv: add stride, offset and modifier to resource_get_param
panfrost: implement resource_get_param
vc4: implement resource_get_param
v3d: implement resource_get_param
vulkan/wsi/x11: add driconf option to not wait under Xwayland
gbm: consistently use the same name for BO flags
gbm: add gbm_{bo,surface}_create_with_modifiers2
gbm: assume USE_SCANOUT in create_with_modifiers
Simon Zeni (5):
gbm: add GBM_FORMAT_R16
i915: remove use of backtrace and backtrace_symbols
glapi/gl_gentable.py: drop call to backtrace on no op
util/u_debug_symbol: remove debug_symbol_name_glibc and execinfo dependency
meson: stop searching for execinfo
Stéphane Marchesin (1):
virgl: Flush context before waiting on fences
Tapani Pälli (22):
crocus: take a reference to memobj bo in crocus_resource_from_memobj
crocus: disable depth and d+s formats with memory objects
iris: handle depth-stencil import with a wrapper function
anv: disable aux for exportable images without modifiers
anv: allow stencil memory export
anv/android: fix build error due to refactoring
mesa: fix timestamp enum with EXT_disjoint_timer_query
mesa: GL_ARB_ES3_2_compatibility GL compat profile support
anv: remove a format assert when setting up attachments
vulkan: provide common functions to check device features
anv: remove feature checks from device creation
radv: remove feature checks from device creation
turnip: remove feature checks from device creation
v3dv: remove feature checks from device creation
lavapipe: remove feature checks from device creation
panvk: remove feature checks from device creation
intel/blorp: fix a compile warning about uninitialized use
intel/isl: FXT1 support was removed on Gfx12.5
swrast: Fix another warning from gcc 11
anv/android: fix parameters given for vk_common_QueueSubmit
anv: use vk_object_zalloc for wsi fences created
iris: clear bos_written when resetting a batch
Thomas H.P. Andersen (1):
nine: Fix assert in tx_src_param
Thomas Wagner (6):
gallium: add utility and interface for memory fd allocations
llvmpipe: add support for EXT_memory_object(_fd)
lavapipe: add support for KHR_external_memory_fd
llvmpipe: enable EXT_memory_object(_fd)
lavapipe: enable KHR_external_memory_fd
util: use anonymous file for memory fd creation
Thong Thai (15):
gallium: add temporal layers cap enum
frontends/va: check number of temporal layers supported by encoder
gallium: update h264 struct to track temporal layers
radeon/vcn/enc: H.264 SVC encode
radeonsi: enable H.264 temporal encoding support for VCN
frontends/va: handle h264 num_temporal_layers for SVC encoding
gallium: change rate ctrl struct to array
r600: change rate ctrl struct to array
radeon/vce: change rate ctrl struct to array
radeon/vcn/enc: change to per-temporal layer rate control
frontends/omx: change rate ctrl struct to array
frontends/va: change to per-layer rate control
gallium/auxiliary/vl: Add additional deinterlace enum and tracking
gallium/util: add half texel offset param to util_compute_blit
frontends/va/postproc: Keep track of deinterlacing method being used
Timothy Arceri (20):
util: document that workaround also fixes Riptale
glsl: replace some C++ code with C
nir/gcm: be less destructive with instruction order
intel/compiler: call nir_opt_dead_cf() after we have finished all opts
intel/compiler: Use GCM in nir_optimize
util: add workaround for Full Bore
glsl: relax rule on varying matching for shaders older than 4.20
intel/compiler: make sure swizzle is applied to if condition
nir: add indirect loop unrolling to compiler options
nir: move nir_block_ends_in_break() to nir.h
nir: add heuristic for instructions in loops with GCM
nir: fix GCM when GVN enabled
glsl: fix variable scope for instructions inside case statements
mesa: fix mesa_problem() call in _mesa_program_state_flags()
glsl: fix variable scope for loop-expression
glsl: handle scope correctly when inlining loop expression
glsl: fix variable scope for do-while loops
util/cache: run basic cache tests on the single file cache
util/cache: test simple cache put and get between instances
mesa: fix buffer overrun in SavedObj texture obj array
Timur Kristóf (71):
radv: Use 128-sized vertex grouping for NGG shaders.
radv: Don’t compile NGG culling into shaders that write viewport index.
radv: Remove num_viewports from radv_skip_ngg_culling.
aco: Swap s_and operand order for ballot.
aco: Allow elect to take advantage of knowing when all lanes are active.
aco: Remove s_and with exec when all lanes are active.
radv: Use pre-computed viewport transform for NGG culling state.
aco: Fix how p_elect interacts with optimizations.
aco, nir, ac: Simplify sequence of getting initial NGG VS edge flags.
ac/nir: Use es_accepted variable after culling.
ac/nir: Use gs_accepted variable after culling.
ac/nir: Don’t count vertices and primitives in wave after culling.
nir, aco: Remove vertex and primitive count overwrite intrinsic.
ac/nir: Remove unhelpful nir_opt_cse from ac_nir_lower_ngg_nogs.
aco: Use Navi 10 empty NGG output workaround on NGG culling shaders.
radv: Don’t toggle PC oversubscription for NGG culling.
radv: Use ac_compute_late_alloc in radv_pipeline.
ac: Remove deprecated use_late_alloc field as nobody uses it anymore.
radv: Write RSRC2_GS for NGGC when pipeline is dirty but not emitted.
aco: Fix to_uniform_bool_instr when operands are not suitable.
radv, ac, aco: Use indices 0-2 of gs_vtx_offset argument array on GFX9+.
radeonsi: Change GS vertex offset arguments to use gs_vtx_offset array.
ac: Calculate workgroup sizes of HW stages that operate in workgroups.
radv: Calculate workgroup sizes in radv_pipeline.
radv: Remove superfluous workgroup size calculations.
aco: Use workgroup size from input shader info.
aco: Consider LDS usage by PS inputs in MaxWaves calculation.
aco: Consider maximum number of workgroups per CU/WGP on Navi.
aco: Emit zero for the derivatives of uniforms.
aco: Unset 16 and 24-bit flags from operands in apply_extract.
nir: Add unsigned upper bound for extract opcodes.
nir: Fix local_invocation_index upper bound for non-compute-like stages.
nir: Add comment to explain the sad_u8x4 opcode.
aco: Fix invalid usage of std::fill with std::array.
ac/nir/ngg: Delete unused struct.
ac/nir/nggc: Don’t stop applying reusable variables at prim export.
ac/nir/nggc: Only repack arguments that are needed.
ac/nir/nggc: Move gs_alloc_req up in NGG culling shaders.
aco: Use Builder reference in emit_copies_block.
aco: Skip code paths to emit copies when there are no copies.
aco/optimize_postRA: Use iterators instead of operator[] of std::array.
aco: Add some useful info to the README for debugging.
radv: Remove PSIZ output when it isn’t needed.
aco: Add ability to optimize v_lshl + v_sub into v_mad_i32_i24.
aco/isel: Fix emit_vop2_instruction to apply 16/24-bit flags properly.
ac/nir: Remove byte permute from prefix sum of the repack sequence.
ac/nir: Fix match_mask to work correctly for VS outputs.
nir: Exclude non-generic patch variables from get_variable_io_mask.
radv: Disable HW generated edge flags for NGG shaders.
ac/nir: Emit edge flag instructions conditionally.
radv/llvm: Don’t read edge flags anymore.
radv: Fix gs_vgpr_comp_cnt for NGG culling in vertex shaders.
ac/nir/nggc: Refactor save_reusable_variables.
ac/nir/nggc: Don’t reuse uniform values from divergent control flow.
radv: Select PC oversubscription rate based on number of PS params.
radv: Reduce NGG culling small draw threshold to 128.
aco: Allow p_extract to have different definition and operand sizes.
aco: Implement integer conversions using p_extract.
aco: Omit p_extract after ds_read with matching bit size.
aco: Don’t write m0 register for LDS instructions on GFX9+.
aco: Fix small primitive precision.
aco: Fix determining whether any culling is enabled.
radv: Don’t declare ngg_gs_state when there is no API GS.
radv: Enable NGG culling by default on GFX10.3, add nonggc debug flag.
ac/nir/cull: Accept NaN and +/- Inf in face culling.
ac/nir/nggc: Write undef to variables in non-repacked ES threads.
aco/optimizer: Skip SDWA on v_lshlrev when unnecessary in apply_extract.
drirc: Fix indentation.
drirc: Apply radv_invariant_geom workaround to Resident Evil Village.
drirc: Apply radv_invariant_geom workaround to World War Z games.
aco: Fix how p_is_helper interacts with optimizations.
Tomeu Vizoso (40):
panvk: Don’t try to update samplers if they are immutable
panvk: Start a new batch when the job index gets above the limit
panvk: Close batch when ending a command buffer
panvk: Move check for fragment requirement up to the draw
panvk: A pipeline might not be bound when the render pass is ended
panvk: Expose panvk_cmd_alloc_fb_desc and panvk_cmd_alloc_tls_desc
panvk: Implement vkCmdClearAttachments
docs/ci: Update http cache config to let Authorization headers pass through
freedreno/ci: Move rules for restricted jobs to test-source-dep.yml
ci: Update canvas_text trace
virgl/ci: Have LLVMPipe use more threads for rendering
virgl/ci: Rebalance concurrency
virgl/ci: Wait a bit before shutting the VM down
virgl/ci: Set NIR_VALIDATE=0 on the host
panfrost: Add padding to pan_blit_blend_shader_key
iris/ci: Add manual jobs for tracking performance
panvk: Initialize timestamp for disk cache
freedreno/ci: Correctly set freq governors to max
iris/ci: Correctly set freq governors to max
panvk/ci: Build-test panvk
ci: Ensure the DRM device is open
lavapipe: add xfails for whole of CTS
vulkan: Read len attribute of parameters to functions
vulkan: Generate code to place commands in a queue
vulkan: Generate entrypoints that enqueue commands
lavapipe: Use generated command queue code
lavapipe: Use c_msvc_compat_args
vulkan: Remove dependency on Python 3.9+
Revert “lavapipe: unbreak imageless framebuffer”
vulkan: Copy pNext structures when enqueuing commands
ci: Uprev piglit to 99be1b06ff36
ci: Stop adding link to tracie dashboard
panfrost/ci: Enable test runs on G72
panvk: Move CmdClear* impl to a separate file
panfrost/ci: Move CI files to src/panfrost
panfrost/ci: Test panvk on Mali G52
ci: Rebuild kernel with Amlogic KMS support
panfrost/ci: Run Piglit’s quick_gl tests on G52
ci: Add support for lazor Chromebooks
ci: Let manual LAVA jobs have a longer timeout than others
Tony Wasserka (24):
radv: Rename radv_shader_helper.h to radv_llvm_helper.h
aco: Separate LLVM/CLRX asm printers more cleanly
aco: Extend set of supported GPUs that can be disassembled with CLRX
radv: Build code which depends on LLVM only when enabled
radv: Disable shader disassembly when no disassembler is available
aco/tests: Assert that the requested IR is actually provided
aco/spill: Avoid unneeded copies when iterating over maps
aco: Use std::vector for the underlying container of std::stack
aco/spill: Remove unused container
aco/spill: Replace map[] with map::insert
aco/spill: Avoid copying next_use maps more often than needed
aco/spill: Persist memory allocations of local next use maps
aco/spill: Avoid destroying local next use maps over-eagerly
aco/spill: Replace vector<map> with vector<vector> for local_next_use
aco/spill: Prefer unordered_map over map for next use distances
aco/spill: Avoid copying current_spills when not needed
aco/spill: Reduce redundant std::map lookups
aco/spill: Replace an std::map to booleans with std::set
aco/spill: Store remat list in an std::unordered_map instead of std::map
aco/spill: Change worklist to a single integer
aco/spill: Reduce allocations in next_uses_per_block
aco/spill: Clarify use of long-lived references by adding const
aco/spill: Use unordered_map for spills_exit
aco/spill: Use std::unordered_map for spills_entry
Vadym Shovkoplias (3):
driconf, glsl: Add a vs_position_always_precise option
drirc: Set vs_position_always_precise for Assault Android Cactus
intel/fs: Fix a cmod prop bug when cmod is set to inst that doesn’t support it
Vasily Khoruzhick (2):
lima: handle fp16 vertex formats
lima: split_load_input: don’t split unaligned vec2
Veerabadhran Gopalakrishnan (2):
radeon/vcn: Add FW header flag to enable VP9 header parsing
gallium/va: Remove VP9 header parsing for secure playback
Vinson Lee (17):
nv50/ir: Initialize Value member id in constructor.
asahi: Move assignment after null check.
spirv_to_dxil: Fix missing-prototypes build error.
meson: Remove duplicate xvmc in build summary.
nir: Initialize evaluate_cube_face_index_amd dst.x.
zink: Remove unnecessary null checks.
nv50/ir: Add FlatteningPass constructor.
freedreno: Require C++17.
broadcom/compiler: Fix qpu.flags.muf typo.
glx: Fix unused-variable warning with macOS build.
draw/tess: Fix unused-function warning with draw-use-llvm=disabled.
nv50/ir: Add DeadCodeElim constructor.
pps: Avoid duplicate elements in with_datasources array.
freedreno: Add valgrind dependency.
anv: Fix assertion.
radv: Fix memory leak on error path.
virgl: Allocate qdws after virgl_init_context to avoid leak.
Witold Baryluk (2):
zink: Do not access just freed zink_batch_state
zink: Fully initialize VkBufferViewCreateInfo for hashing
Yevhenii Kharchenko (1):
iris: fix layer calculation for TEXTURE_3D ReadPixels() on mip-level>0
Yevhenii Kolesnikov (19):
glsl: Add operator for .length() method on implicitly-sized arrays
glsl: Properly handle .length() of an unsized array
vulkan: Add a common vk_command_buffer structure
anv: Use a common vk_command_buffer structure
radv: Use a common vk_command_buffer structure
turnip: Use a common vk_command_buffer structure
v3dv: Use a common vk_command_buffer structure
lavapipe: Use a common vk_command_buffer structure
vulkan: Add a common vk_queue structure
anv: Use a common vk_queue structure
radv: Use a common vk_queue structure
turnip: Use a common vk_queue structure
v3dv: Use a common vk_queue structure
lavapipe: Use a common vk_queue structure
vulkan: Implement VK_EXT_debug_utils
vulkan/enum_to_str: Add generator for VkObjectType to Vulkan Handle
vulkan: Add vk_asprintf and vk_vasprintf helpers
vulkan: Add convenience debug message helpers
anv: Switch to new debug message helpers
Yipeng Chen (Jasber) (1):
radeonsi: do not use staging texture for APU
Yiwei Zhang (24):
venus: cache ahb backed buffer memory type bits requirement
venus: fix all missing vn_object_base_fini
venus: scrub ignored fields of pipeline info when rasterization is disabled
venus: refactor failure path for sets allocation
venus: add vn_descriptor_set_layout_init
venus: descriptor layout to track more binding infos
venus: layout to track variable descriptor count binding info
venus: descriptor pool to track pool state
venus: descriptor set to track descriptor count of last binding
venus: check descriptor allocations against pool resource
venus: conditionally enable async descriptor set allocation
venus: set maxMipLevels to 1 for ahb images
venus: renderer to check map size only when mappable
venus: workaround a blob_mem mappable size check issue
venus: suggest the proper sampler ycbcr model conversion based on format
docs: update vn extension list
venus: amend supported extensions list
venus: properly check and fill ahb buffer properties
util: fix sign comparison
radv/anv android: rename buffer usage camera mask
android_stub: update platform headers to include atrace
venus: update to latest venus-protocol to include tracing
dri_interface: remove obsolete interfaces
dri_interface: remove gl header
Yogesh Mohan Marimuthu (2):
radeonsi: remove redundant setting scratch_state atom dirty
radeonsi: set scratch_state dirty only if ctx->scratch_buffer allocated
Yogesh Mohanmarimuthu (1):
vulkan/device-select: select correct default device for xcb apiVersion 1.0
Zachary Michaels (1):
X11: Ensure that VK_SUBOPTIMAL_KHR propagates to user code
Zhu Yuliang (1):
gallium/vl: don’t leak fd in vl_dri3_screen_create
byte[] (1):
i965: Explicitly abort instead of exiting on batch failure
liuyujun (1):
gallium: fix surface->destroy use-after-free
mattvchandler (1):
gallium/osmesa: fix buffer resizing
mwezdeck (1):
mesa: validate texture format against GL/ES ctx
orbea (1):
build: add sha1_h for lp_texture.c
suijingfeng (4):
gallivm: add basic mips64 support and set mcpu to mips64r5 on ls3a4000
pass egl-symbols-check test on mips64el
gallivm: fix pass init order on mips64 with llvm 8
llvmpipe: correct the debug information printed with GALLIVM_PERF=nopt
xantares (1):
lavapipe: Fix 32-bit Windows build