Mesa 20.2.0 Release Notes / 2020-09-28

Mesa 20.2.0 is a new development release. People who are concerned with stability and reliability should stick with a previous release or wait for Mesa 20.2.1.

Mesa 20.2.0 implements the OpenGL 4.6 API, but the version reported by glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) / glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used. Some drivers don’t support all the features required in OpenGL 4.6. OpenGL 4.6 is only available if requested at context creation. Compatibility contexts may report a lower version depending on each driver.
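
Because the reported version depends on the driver, an application may prefer to parse the string returned by glGetString(GL_VERSION) rather than assume 4.6. The following is a minimal Python sketch under stated assumptions: the string has already been fetched through some GL binding (not shown), and it follows the "major.minor" layout described in the OpenGL specification, optionally prefixed with "OpenGL ES" on GLES contexts. The function name parse_gl_version and the sample strings are hypothetical, not actual driver output.

```python
import re

def parse_gl_version(version_string):
    """Return (major, minor) parsed from a GL_VERSION string.

    Desktop GL strings start with "major.minor"; OpenGL ES strings
    start with "OpenGL ES major.minor" (or "OpenGL ES-CM" on ES 1.x).
    Everything after the version number is vendor-specific text.
    """
    match = re.match(r"(?:OpenGL ES(?:-\w+)? )?(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"unrecognized GL_VERSION string: {version_string!r}")
    return int(match.group(1)), int(match.group(2))

# Hypothetical sample strings, for illustration only.
print(parse_gl_version("4.6 (Core Profile) Mesa 20.2.0"))
print(parse_gl_version("OpenGL ES 3.2 Mesa 20.2.0"))
```

Comparing the resulting tuple against (4, 6) is one way to decide at runtime whether 4.6-only features can be used.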

Mesa 20.2.0 implements the Vulkan 1.2 API, but the version reported by the apiVersion property of the VkPhysicalDeviceProperties struct depends on the particular driver being used.
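
The apiVersion property is a packed 32-bit integer, so it has to be decoded before it can be compared against 1.2. The sketch below assumes the bit layout used by the VK_API_VERSION_MAJOR, VK_API_VERSION_MINOR and VK_API_VERSION_PATCH macros (7, 10 and 12 bits respectively, with the top 3 bits holding a variant that is ignored here). The helper names and the patch number 145 are hypothetical.

```python
def decode_vk_api_version(api_version):
    """Split a packed 32-bit Vulkan version into (major, minor, patch)."""
    major = (api_version >> 22) & 0x7F   # bits 22-28
    minor = (api_version >> 12) & 0x3FF  # bits 12-21
    patch = api_version & 0xFFF          # bits 0-11
    return major, minor, patch

def make_vk_api_version(major, minor, patch):
    """Pack (major, minor, patch) the same way, with variant 0."""
    return (major << 22) | (minor << 12) | patch

# Hypothetical value, as a driver might report for a 1.2.x implementation.
packed = make_vk_api_version(1, 2, 145)
print(decode_vk_api_version(packed))
```

A check such as decode_vk_api_version(value)[:2] >= (1, 2) then tells whether the driver in use exposes at least Vulkan 1.2.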

SHA256 checksum

63f0359575d558ef98dd78adffc0df4c66b76964ebf603b778b7004964191d30  mesa-20.2.0.tar.xz
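
One way to check a downloaded tarball against the checksum above is sketched here in Python, using only the standard hashlib module. The function names and the chunk size are illustrative choices, and the file path is whatever location the tarball was saved to.

```python
import hashlib

# Checksum published above for mesa-20.2.0.tar.xz.
EXPECTED = "63f0359575d558ef98dd78adffc0df4c66b76964ebf603b778b7004964191d30"

def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify(path, expected=EXPECTED):
    """Return True if the file's digest matches the expected checksum."""
    return sha256_of_file(path) == expected.lower()

# Example call (path is an assumption):
# verify("mesa-20.2.0.tar.xz")
```

Reading in chunks keeps memory use flat regardless of the tarball size.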

New features

  • GL_ARB_compute_variable_group_size on Iris.

  • GL_ARB_gpu_shader5 on llvmpipe

  • GL_ARB_post_depth_coverage on llvmpipe

  • GLES 3.2 on llvmpipe

  • GL_EXT_shader_group_vote on GLES3.

  • GL_EXT_texture_shadow_lod on llvmpipe

  • VK_AMD_texture_gather_bias_lod on RADV.

  • VK_AMD_gpu_shader_half_float on RADV/ACO.

  • VK_AMD_gpu_shader_int16 on RADV/ACO.

  • VK_EXT_extended_dynamic_state on ANV and RADV.

  • VK_EXT_image_robustness on RADV.

  • VK_EXT_private_data on ANV and RADV.

  • VK_EXT_custom_border_color on ANV and RADV.

  • VK_EXT_pipeline_creation_cache_control on ANV and RADV.

  • VK_EXT_shader_demote_to_helper_invocation on RADV/LLVM.

  • VK_EXT_subgroup_size_control on RADV/ACO.

  • VK_GOOGLE_user_type on ANV and RADV.

  • VK_KHR_shader_subgroup_extended_types on RADV/ACO.

  • GL_ARB_gl_spirv on nvc0/nir.

  • GL_ARB_spirv_extensions on nvc0/nir.

  • RADV now uses ACO as its default backend

  • RADV_DEBUG=llvm option to enable LLVM backend for RADV

  • VK_EXT_image_robustness for ANV

  • VK_EXT_shader_atomic_float on ANV

  • VK_EXT_4444_formats on ANV and RADV.

  • VK_KHR_memory_model on RADV.

  • GL 4.5 on llvmpipe

  • EGL_KHR_swap_buffers_with_damage on X11 (DRI3)
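
Since ACO is now the default RADV backend, switching back to LLVM means setting RADV_DEBUG=llvm in the environment of the application being launched. The Python sketch below builds such an environment, assuming RADV_DEBUG holds a comma-separated list of flags so that an existing value (for example zerovram) is preserved. The helper name env_with_radv_debug is hypothetical.

```python
import os

def env_with_radv_debug(flag, base_env=None):
    """Return a copy of the environment with `flag` added to RADV_DEBUG."""
    env = dict(os.environ if base_env is None else base_env)
    # Keep any flags that are already set, dropping empty entries.
    flags = [f for f in env.get("RADV_DEBUG", "").split(",") if f]
    if flag not in flags:
        flags.append(flag)
    env["RADV_DEBUG"] = ",".join(flags)
    return env

# Example use with the standard subprocess module (command is an assumption):
# subprocess.run(["vulkaninfo"], env=env_with_radv_debug("llvm"))
print(env_with_radv_debug("llvm", {"RADV_DEBUG": "zerovram"})["RADV_DEBUG"])
```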

Bug fixes

  • [Regression][Bisected][20.2][radeonsi] American Truck Simulator continually allocates memory until OOM

  • anv: dEQP-VK.robustness.robustness2.* failures on gen12

  • [RADV] Problems reading primitive ID in fragment shader after tessellation

  • Massive memory leak (at least AMD, others unknown)

  • Substance Painter 6.1.3 black glitches on Radeon RX570

  • vkCmdCopyImage broadcasts subsample 0 of MSAA src into all subsamples of dst on RADV

  • Crash in ruvd_end_frame when calling vaBeginPicture/vaEndPicture without rendering anything

  • X-Plane 11 Installer crashes on startup since glsl: declare gl_Layer/gl_ViewportIndex/gl_ViewportMask as vs builtins

  • Horizon Zero Dawn graphics corruption with radv

  • Amber test opt_peel_loop_initial_if: Assertion failed

  • Dirt Rally: Flickering glitches on certain foliage since Mesa 20.1.0 caused by MSAA

  • [BRW] WRC 5 asserts with gallium nine and iris.

  • radv: Corruption in “The Surge 2”

  • [RADV] Detroit: Become Human Demo game lock-ups with RADV

  • Road Redemption certain graphic effects rendered white color

  • vulkan/wsi/x11: deadlock with Xwayland when compositor holds multiple buffers

  • [RADV/ACO] Death Stranding causes a GPU hang (ERROR Waiting for fences timed out!)

  • lp_bld_init.c:172:7: error: implicit declaration of function ‘LLVMAddConstantPropagationPass’; did you mean ‘LLVMAddCorrelatedValuePropagationPass’? [-Werror=implicit-function-declaration]

  • Intel Vulkan driver crash with alpha-to-coverage

  • EGL_KHR_swap_buffers_with_damage support on X11

  • radv: blitting 3D images with linear filter

  • [ACO] Compiling pipelines from RPCS3’s shader interpreter spins forever in ACO code

  • Intel Vulkan driver assertion with small xfb buffer

  • [spirv-fuzz] SPIR-V parsing failed “src->type->type == dest->type->type”

  • radeonsi: radeonsi crashes in Chrome on chromeos

  • [RADV] commit d19bc94e4eb94 broke gamescope with Navi

  • 4e3a7dcf6ee4946c46ae8b35e7883a49859ef6fb breaks Gamescope showing windows properly.

  • anv: crashes in CTS test dEQP-VK.subgroups.*.framebuffer.*_tess_eval

  • Intel Vulkan (anv) crash in copy_non_dynamic_state() when using validation layer

  • Mafia 3: Trees get rendered incorrectly

  • radv: dEQP-VK.synchronization.op.multi_queue.timeline_semaphore.write_clear_attachments_*_concurrent fail when forcing DCC.

  • Crash on GTA 5 through proton 5.0.9 and GE versions

  • Mesa 20.2.0-rc1 fails to build for AMD

  • Assertion failure compiling shader from Zigguart

  • Panfrost locks for waiting fence when running Source engine games

  • ci: -Dtools=panfrost should be build-tested

  • panfrost: Register allocation fails for Firefox WebRender shaders

  • VRAM leak with vulkan external memory + opengl memory objects

  • [vulkan/build] Recent build system changes made VK_EXT_acquire_xlib_display unnecessarily depend on GBM

  • ci: Capture devcoredumps on chezas

  • Possible array out of bounds in brw_vec4_nir.cpp

  • freedreno/a6xx: incorrect rendering in asphalt 9

  • [tgl][bisected][regression][iris] failure on dEQP-EGL.functional.wide_color.pbuffer_8888_colorspace_default

  • Multiply defined symbols compiling with gcc@10.1.0

  • shrinking descriptor pool on intel+vulkan

  • dEQP-VK.renderpass2.dedicated_allocation.attachment.1.12 fails on NAVI14

  • turnip: binning and indirect dependency

  • Amber test leads to NIR validation failed after nir_opt_if (on spirv-fuzz shader)

  • Unable to compile mesa-git from b559d26c

  • Ambient light too bright with ACO in AC: Odyssey

  • Multiple issues with Detroit Become Human

  • ci: Capture artifacts in baremetal mode

  • turnip/ir3: fine derivatives

  • panfrost: regression: Major stuttering and low compositor FPS with glmark2

  • khr_debug-push-pop-group_gl: ../src/util/simple_mtx.h:86: simple_mtx_lock: Assertion `c != _SIMPLE_MTX_INVALID_VALUE’ failed.

  • freedreno/a6xx: skai/skqp fails

  • SPIR-V parsing fails in src/compiler/spirv/spirv_to_nir.c

  • SPIR-V parsing fails in src/compiler/spirv/vtn_cfg.c

  • Weird GLSL bug

  • iris driver is broken in Freedesktop 19.08

  • LLVM not properly shutdown in si_pipe.c?

  • Panfrost: add current status to docs/features.txt

  • Opengl incorrect rendering on yuzu Amd

  • RADV: VK_ACCESS_MEMORY_READ/WRITE_BIT is not implemented

  • [bisected][regression][all platforms] multiple deqp-gles31/glescts/piglit failures

  • 7406ea37, “ac/surface: require that gfx8 doesn’t have DCC in order to be displayable”, breaks Gamescope being able to launch games on RX580, and possibly other gfx8 cards

  • vkGetSemaphoreCounterValue doesn’t update without vkWaitSemaphores calls on Intel UHD 620

  • [RADV] System crash when playing XCOM Chimera Squad because of commit #7a5e6fd2

  • [RADV] Non-precise occlusion queries return non-zero when all fragments are discarded

  • [DXVK] Project Cars rendering problems

  • ADDRLIB ODR Violation

  • Build fails with current mesa from git “undefined reference to ‘nir_lower_clip_disable’”

  • KDE Compositor stuttering after Check for window destruction in dri3_wait_for_event_locked

  • Add fallthrough to prevent errors caused by missing break

  • i965/20.1: gray rendering with torcs racing

  • glBindBufferRange call seems to be ignored by one of two shader-programs on radeon cards

  • [bisected][g33] piglit.spec.ext_framebuffer_object.fbo-cubemap failure

  • Increase GL_MAX_COMPUTE_SHADER_STORAGE_BLOCKS to greater value.

  • nir: st_nir_lower_builtin fails for gl_LightSource[i]

  • Sometimes VLC player process gets stuck in memory after closure if video output used is Auto or OpenGL

  • Double unlock in rbug_context.c

  • Double copy for TexSubImage

  • [v3d] corruption when GS omits some vertices

  • Iris crashes when reading from multisampled front buffer on platforms without front buffer

  • freedreno: subway surfers crash when repeatedly toggling fullscreen

  • [RADV/GFX8] Performance drop in DOOM Eternal when “Present from compute” is enabled

  • freedreno: multiple applications crash on a5xx

  • Use-after-free crash in nv50_ir::GCRA::RIG_Node::init()

  • intel: Sample mask writes need to be honored in Vulkan

  • [RADV] - Path of Exile (238960) - Map outline, landscape and markers are missing with the Vulkan renderer.

  • ASTC texture decompression fails when using software fallback

  • [i965][iris][regression][bisected] multiple piglit and glcts failures on all platforms

  • please publish GPG keyring used to sign new releases

  • [BISECTED] compiling shader causes crash

  • Missing render Information on Stellaris

  • freedreno/ir3: allow copy-propagate from array

  • Zink + GALLIUM_HUD SIGSEGV

  • piglit spec@egl_ext_device_base@conformance fails LLVM 11 Git assertion since “llvmpipe/fs: add caching support”

  • llvmpipe: 1x1 framebuffer with a 2x2 viewport

  • [regression] nir build failure

  • ci: need to end baremetal tests after kernel panic/instaboot

  • If-statement body is executed for false condition

  • freedreno/a6xx: broken rendering in playcanvas “after the flood”

  • [regression] performance drop on Dota 2, CS:GO, and gfxbench GL benchmarks on ICL/Iris

  • [amd] C++ ODR violation for union GB_ADDR_CONFIG

  • Zink reports incorrect amount of video memory

  • [RADV/LLVM]: void llvm::ICmpInst::AssertOK(): Assertion `getOperand(0)->getType() == getOperand(1)->getType() && “Both operands to ICmp instruction are not of the same type!”’ failed.

  • glsl-1.50-gs-max-output hangs on Navi10 + NGG

  • anv: Runs out of binding tables with PPSSPP during long runs

  • Segfault in Panfrost with waypipe

  • ci: Use rsync instead of rm -rf ; cp for baremetal rootfs

  • i965: Rendering problems replaying a trace of “Refunct” after mesa-20.1.0-rc1 release [bisected]

  • Panfrost (rk3399 NanoPi M4) hang/crash on playing video on Kodi/X11

  • gallium/winsys/radeon/drm fails assertion on 32bit

  • NIR validation failed after glsl to nir, before function inline, wrong {src,dst}->type ?

  • nir/spirv asin() function not precise enough

  • Mesa 20.0.7 / 20.1.0-rc4 regression, extremely long shader compilation time in NIR

  • Android build error after 689acc73

  • freedreno/a6xx: gpu hangs in google earth

  • Mesa-git build fails on Fedora Rawhide

  • Doom Eternal 1.1 performs very poorly on RADV

  • iris/i965: possible regression in 20.0.5 due to changes in buffer manager sharing across screens (firefox/mozilla#1634213)

  • Incorrect _NetBSD__ macro inside execmem.c

  • Possible invalid sizeof in device.c

  • YUV FP16 lowering validation failing

  • GLSL compiler assertion is_float() failed in glsl/ir_validate.cpp, visit_leave on specific WebGL shader

  • [RADV] - Doom Eternal (782330) & Metro Exodus (412020) - Title requires ‘RADV_DEBUG=zerovram’ to eliminate colorful graphical aberrations.

  • mesa trunk master vulkan overlay-layer meson.build warning empty configuration_data() object

  • [meson] increase minimum required version

  • Kicad fails to render 3D PCB models.

  • freedreno: minetest: alpha channel issue on a6xx

  • Reproducible i915 gpu hang Intel Iris Plus Graphics (Ice Lake 8x8 GT2)

  • 7 Days to Die - “Reflection Quality” setting broken, results in environment rendered black

  • glsl: regression affecting shader compilation time

  • freedreno: glamor issue with x11 desktops

  • finish converting from fnv1a to xxhash

  • Hang in iris_dri in kitty

  • Setting twice value to output_stream in radv_nir_to_llvm.c

  • Overwriting value of jit_tex->sample_stride in lp_setup.c

  • [AMDGPU][OpenGL] apitrace of kernel/firmware crash that requires a reboot

  • Flickering in Superposition benchmark

  • Double lock in fbobject.c

  • Possible typo in aco_insert_waitcnt.cpp

  • [bisected] Steam crashes when newest Iris built with LTO

  • Freeing null pointer inside radv_amdgpu_cs.c

  • Duplicated sub expression in radv_nir_to_llvm.c

  • i965/vec4: opt_cse_local cause the out of bound array access

  • NIR: Regression on shader using 8/16-bit integers

  • ACO: Compiler segfault on 8/16-bit integers.

  • lp_bld_intr.c:70:16: error: use of undeclared identifier ‘LLVMFixedVectorTypeKind’; did you mean ‘LLVMVectorTypeKind’?

  • recent seqno changes causing surfaceflinger crash

  • [radeonsi] [glthread] Crash with glthread enabled

  • Deadlock in anv_timelines_wait()

  • [gles3] supertuxkart: some textures are incorrect

  • post_version.py does not work with release candidates

  • radv regression on android

  • ogl: Set mesa_glthread=true as default on the RPCS3 emulator

  • [iris] android deqp dEQP-EGL.functional.robustness.negative_context#invalid_notification_strategy_enum fails

  • zink: conditional rendering

  • [RadeonSI] Glitches on VEGA8 + RX 560X after MR 4863

  • RadeonSI OpenGL broken for GFX8 after unify code for overriding offset

  • freedreno/turnip: Don’t request fragcoord components we don’t use

  • Make check fails in ANV

  • src\util\meson.build:294:4: ERROR: Program or command ‘winepath’ not found or not executable

  • Please add Zink to features.txt

  • llvmpipe: assert triggers in LLVM

  • debug builds are massively broken on Windows

  • ci: Report flakes on IRC from baremetal tests

  • heavy glitches on amd ryzen 5 since version 20.x

  • zink asserts with 32-bit boolean

  • OpenGL: Surviving Mars black screen late-game (possible shader problem)

  • Kerbal Space Program (KSP) hangs entire Navi system

  • Dirt: Showdown bad performance and broken rendering with enabled advanced lighting

  • gravit & Firefox WebGL broken since 3dc2ccc14c0e035368fea6ae3cce8c481f3c4ad2 “ac/surface: replace RADEON_SURF_OPTIMIZE_FOR_SPACE with !FORCE_SWIZZLE_MODE”

  • mesa 20.0.5 causing kitty to crash

  • radeonsi: “Torchlight II” trace showing regression on mesa-20.0.6 [bisected]

  • [RADV/LLVM/ACO/Regression] After mesa commit a3dc7fffbb7be0f1b2ac478b16d3acc5662dff66 all games get stuck at start

  • Android building error after commit 2ab45f41

  • freedreno/a6xx: pubg rendering glitches

  • iris: Crash when trying to capture window in OBS Studio

  • lp_test_format failure with llvm-11

Changes

Abhishek Kumar (1):

  • egl: Limit the EGL ver for android

Adam Jackson (1):

  • glx: Fix build and warnings with -Dglx=dri -Dglx-direct=false

Alejandro Piñeiro (9):

  • v3d/tex: only look up the 2nd texture gather offset for 1d non-arrays

  • v3d/tex: set up default values for Configuration Parameter 1 if possible

  • v3d/tex: use TMUSLOD register if possible

  • v3d: moving v3d simulator to src/broadcom

  • v3d/tex: handle correctly coordinates for cube/cubearrays images

  • vulkan/util: add struct vk_pipeline_cache_header

  • nir/lower_tex: handle query lod with nir_lower_tex_packing_16 at lower_tex_packing

  • v3d/packet: fix typo on Set InstanceID/PrimitiveID packet

  • v3d: set instance id to 0 at start of tile

Alyssa Rosenzweig (475):

  • pan/mdg: Track more types

  • pan/mdg: Be a bit more pedantic in invert passes

  • panfrost: Enumify bifrost blend types

  • pan/bi: Add texture indices to IR

  • pan/bi: Pipe multiple textures through

  • pan/bi: Pack round opcodes (FMA, either 16 or 32)

  • pan/bit: Add framework for interpreting double vs float

  • pan/bit: Interpret ROUND

  • pan/bit: Add round tests

  • panfrost: Fix texture field size

  • panfrost: Fix size of bifrost sampler descriptor

  • panfrost: Fix sampler wrap/filter field orders

  • panfrost: Fix norm coords on bifrost sampler

  • panfrost: Fix tiled texture “stride”s on Bifrost

  • pan/decode: Don’t crash on missing payload

  • pan/bi: Enable lower_mediump_outputs NIR pass

  • panfrost: Update Bifrost fields in mali_shader_meta

  • pan/bi: Lower for now sincos

  • pan/mdg: Ingest actual isub ops

  • pan/mdg: Rename .one to .sat_signed

  • pan/mdg: Move constant switch opts to algebraic pass

  • pan/mdg: Drop forever todo

  • pan/mdg: Drop opt in name of midgard_opt_cull_dead_branch

  • pan/mdg: Enable nir_opt_algebraic_distribute_src_mods

  • panfrost: Update dEQP expectation list

  • panfrost: Setup gl_FragCoord as sysval on Bifrost

  • pan/bi: Add clause type for gl_FragCoord.zw load

  • pan/bi: Abort on unknown op packing

  • pan/bi: Abort on unhandled intrinsics

  • pan/bi: Futureproof COMBINE lowering against non-u32

  • pan/bi: Print bad instruction on src packing fail

  • pan/bi: Passthrough direct ld_var addresses

  • pan/bi: Lower gl_FragCoord

  • pan/bi: Set clause type for gl_FragCoord.z

  • pan/bi: Fix double-abs flipping

  • pan/bi: Fix missing swizzle

  • pan/bi: Fix incorrectly flipped swizzle

  • pan/bi: Disable CSEL4 emit for now

  • pan/bi: Fix DISCARD ops in disasm

  • pan/bi: Structify DISCARD

  • pan/bi: Remove BI_GENERIC

  • pan/bi: Unwrap BRANCH into CONDITIONAL class

  • pan/bi: Handle discard_if in NIR->BIR naively

  • pan/bi: Emit discard (not if)

  • pan/bi: Add float-only mode to condition fusing

  • pan/bi: Fuse conditions into discard_if

  • pan/bi: Handle discard/branch in get_component_count

  • pan/bi: Pack ADD.DISCARD

  • pan/bi: Structify ADD ICMP 16

  • pan/bi: Pack ADD ICMP 32

  • pan/bi: Pack ADD ICMP 16

  • pan/bi: Don’t pack ICMP on FMA

  • pan/bit: Add swizzles to round tests

  • pan/bit: Add more 16-bit fmod tests

  • pan/bit: Add ICMP tests

  • pan/bi: Rename BI_ISUB to BI_IMATH

  • pan/bi: Use IMATH for nir_op_iadd

  • pan/bi: Pack FMA IADD/ISUB 32

  • pan/bi: Pack ADD IADD/ISUB for 8/16/32

  • pan/bi: Add SUB.v2i16/SUB.v4i8 opcodes to disasm

  • pan/bi: Don’t schedule <32-bit IMATH to FMA

  • pan/bit: Interpret IMATH

  • pan/bit: Interpret v4i8 ops

  • pan/bit: Remove test names

  • pan/bit: Use swizzle helper for round

  • pan/bit: Factor out identity swizzle helper

  • pan/bit: Add IMATH packing tests

  • pan/decode: Fix flags_hi printing

  • pan/mdg: Explain helper invocations dataflow theory

  • pan/mdg: Analyze helper invocation termination

  • pan/mdg: Analyze helper execution requirements

  • pan/mdg: Use the helper invo analyze passes

  • pan/mdg: Use analysis to set .cont/.last flags

  • pan/mdg: Remove texture_op_count

  • pan/mdg: Set types for derivatives

  • pan/mdg: Fix derivative swizzle

  • panfrost: Run dEQP-GLES3.functional.shaders.derivate.* on CI

  • pan/decode: Use a page table for tracking mmaps

  • pan/decode: Fix min/max_tile_coord mixup

  • pan/mfbd: Add format codes for PIPE_FORMAT_B5G5R5A1_UNORM

  • panfrost: Switch formats to table

  • panfrost: Fix Z24 vs Z32 mixup

  • panfrost: Enable AFBC for Z24X8

  • nir: Add fsat_signed opcode

  • nir: Add fclamp_pos opcode

  • panfrost: Add modifier detection helpers

  • pan/mdg: Remove .pos propagation pass

  • pan/mdg: Drop nir_lower_to_source_mods

  • pan/mdg: Prepare for modifier helpers

  • pan/mdg: Ingest fsat_signed/fclamp_pos

  • pan/mdg: Apply abs/neg modifiers

  • pan/mdg: Treat inot as a modifier

  • pan/mdg: Remove invert optimizations

  • pan/mdg: Use helpers for branch/discard inversion

  • pan/mdg: Apply outmods

  • pan/mdg: Emit fcsel when beneficial

  • pan/mdg: Optimize pipelining logic

  • pan/mdg: Precompute mir_special_index

  • pan/mdg: Optimize liveness computation in DCE

  • pan/mdg: Handle comparisons in fp16 path

  • pan/mdg: Fix constant combining crash

  • pan/mdg: Remove mir_*size routines

  • pan/mdg: Remove mir_get_alu_src

  • pan/mdg: Include more types

  • pan/mdg: Handle dest up/lower correctly with swizzles

  • pan/mdg: Respect !32-bit sizes in RA

  • pan/mdg: Explain ld/st sign/zero extension

  • pan/mdg: Add abs/neg/shift modifiers to IR

  • pan/mdg: Use src_types to determine size in scheduling

  • pan/mdg: Use type to determine triviality of a move

  • pan/mdg: Identify scalar integer mods

  • pan/mdg: Promote imov to fmov on a NIR level

  • pan/mdg: Remove promote_float pass

  • pan/mdg: Defer modifier packing until emit time

  • pan/mdg: Remove redundant redundancy

  • pan/mdg: Streamline dest_override handling

  • pan/mdg: Implement b2f16

  • pan/mdg: Don’t generate conversions for fp16 LUTs

  • pan/mdg: Ignore dest.type when offsetting load swizzle

  • pan/lcra: Remove unused alignment parameters

  • pan/lcra: Allow per-variable bounds to be set

  • pan/mdg: Use type size to determine alignment

  • pan/mdg: Eliminate load_64

  • pan/mdg: Set RA bounds for fp16

  • pan/mdg: Print mask when dest=0

  • pan/mdg: Round up bytemasks when spilling

  • pan/mdg: Print constant vectors less wrong

  • pan/mdg: Factor out mir_adjust_constant

  • pan/mdg: Only combine 16-bit constants to lower half

  • pan/mdg: Separately pack constants to the upper half

  • pan/mdg: Fix type checking issues with compute

  • pan/mdg: Pack barriers correctly

  • pan/mdg: Use shifts instead of division for RA sizes

  • pan/mdg: Implement vector constant printing for 8-bit

  • pan/mdg: Implement condense_writemask for 8-bit

  • pan/mdg: Pack 8-bit swizzles in 16-bit ops

  • panfrost: Guard experimental fp16 behind debug flag

  • panfrost: Keep cached BOs mmap’d

  • panfrost: Remove deadcode

  • panfrost: Fill in SCALED formats to format table

  • panfrost: Don’t set PIPE_CAP_VERTEX_BUFFER_STRIDE_4BYTE_ALIGNED_ONLY

  • panfrost: Don’t zero staging buffer for tiling

  • panfrost: Allow bpp24 tiling

  • panfrost: Allow tiling on RECT textures

  • panfrost: Limit blend shader work count

  • panfrost: Remove dated comment about leaks

  • panfrost: Disable tib read/write when colourmask = 0x0

  • panfrost: Avoid redundant shader executions with mask=0x0

  • panfrost: Don’t set CAN_DISCARD for MFBD

  • panfrost: Fix transform feedback types

  • pan/mdg: Cleanup comments that look like division

  • pan/mdg: Eliminate expand_writemask division

  • pan/mdg: Eliminate 64-bit swizzle packing division

  • pan/mdg: Avoid division in printing helpers

  • pan/mdg: Eliminate remaining divisions from compiler

  • panfrost: Fix dated comment

  • panfrost: Use _mesa_roundevenf when packing clear colours

  • panfrost: Handle !independent_blend for blend shaders

  • pan/mdg: Add pack_colour_32 opcode

  • pan/mdg: Lower shifts to 32-bit

  • pan/mdg: Ensure we don’t DCE into impossible masks

  • pan/mdg: Allow DCE on ld_color_buffer masks

  • panfrost: Add debug print before query flushes

  • panfrost: Only run batch debug when specifically asked

  • nir: Add un/pack_32_4x8 opcodes

  • util: Add SATURATE macro

  • util/format: Use SATURATE

  • mesa: Use SATURATE

  • mesa/swrast: Use SATURATE

  • gallium/draw: Use SATURATE

  • glsl: Use SATURATE

  • panfrost: Use SATURATE

  • softpipe: Use SATURATE

  • intel: Use SATURATE

  • i965: Use SATURATE

  • iris: Use SATURATE

  • etnaviv: Use SATURATE

  • nouveau: Use SATURATE

  • pan/decode: Fix unused variable warning

  • pan/decode: Fix tiler warning

  • pan/decode: Dump missing field on Bifrost

  • pan/decode: Dump unknown2

  • panfrost: Fix Bifrost blending with depth-only FBO

  • panfrost: Adjust null_rt for Bifrost

  • panfrost: Tweak zsbuf magic numbers for Bifrost

  • panfrost: Tweak Bifrost colour buffer magic

  • panfrost: Force Z/S tiling on Bifrost

  • panfrost: Share MRT blend flag calculation with Bifrost

  • panfrost: Set unk2 to accommodate blending

  • panfrost: Identify Bifrost texture format swizzle

  • panfrost: Ensure nonlinear strides are 16-aligned

  • panfrost: Document Midgard Inf/NaN suppress bit

  • panfrost: Add defines for bifrost unk1 flags

  • panfrost: Identify MALI_BIFROST_EARLY_Z flag

  • panfrost: Set MALI_BIFROST_EARLY_Z as necessary

  • pan/decode: Decode Bifrost shader flags

  • pan/bi: Add TEX.vtx opcode for vertex texturing

  • pan/bi: Also add compact vertex texturing

  • pan/bi: Document compute_lod bit for compact tex

  • pan/bi: Allow vertex txl with lod=0 as compact

  • pan/bi: Add f16 TEXC.vtx op

  • pan/bi: Pack compact vertex texturing

  • pan/bi: Add CSEL.16 packing tests

  • pan/bi: Suppress inf/nan for now

  • panfrost: Don’t generate gl_FragCoord varying on Bifrost

  • panfrost: Set reads_frag_coord as a sysval

  • panfrost: Preload gl_FragCoord on Bifrost

  • pan/bi: Remove FMA? parameter from get_src

  • pan/bi: Remove comment about old scheduler design

  • pan/bi: Move bi_registers to common IR structures

  • pan/bi: Move bi_registers to bi_bundle

  • pan/bi: Drop struct from bi_registers

  • pan/bi: Add FILE* argument to bi_print_registers

  • pan/bi: Move bi_flip_ports out of port assignment

  • pan/bi: Document constant count invariant

  • pan/bi: Disassemble pos=0xe

  • pan/bi: Add MUL.i32 to disasm

  • pan/bi: Remove more artefacts of 2-pass scheduling

  • pan/bi: Add bi_layout.c for clause layout helpers

  • pan/bi: Add helper to measure clause size

  • pan/bi: Remove schedule_barrier

  • pan/bi: Allow printing branches without targets

  • pan/bi: Fix emit_if successor assignment

  • pan/bi: Only rewrite COMBINE dest if not SSA

  • pan/bi: Fix CONVERT component counting

  • pan/bi: Fix branch condition typesize

  • pan/bi: Passthrough ZERO in branch packing

  • pan/bi: Add branch constant field to IR

  • pan/bi: Pack branch offset constants

  • pan/bi: Set branch_constant if there is a branch

  • pan/bi: Assign constant port for branch offsets

  • pan/bi: Preliminary branch packing

  • pan/bi: Link clauses back to their blocks

  • pan/bi: Add bi_foreach_clause_in_block_from{_rev} helpers

  • pan/bi: Measure distance between blocks

  • pan/bi: Pack proper clause offsets

  • pan/bi: Set branch_conditional if b2b is set

  • pan/bi: Set back-to-back bit more accurately

  • pan/bi: Set branch conditional bit

  • pan/bi: Pack unconditional branch

  • pan/bi: Defer block naming until after emit

  • pan/bi: Add bi_foreach_block_from_rev helper

  • pan/bi: Measure backwards branches as well

  • pan/bi: Allow two successors in header packing

  • pan/bi: Passthrough deps of the branch target

  • panfrost: Disable QUAD_STRIP/POLYGON on Bifrost

  • panfrost: Add GPU IDs for G31/G52

  • panfrost: Probe G31/G52 if PAN_MESA_DEBUG=bifrost

  • pan/mdg: Handle un/pack opcodes as moves

  • pan/mdg: Add pack_unorm_4x8 via 8-bit

  • pan/mdg: Treat packs “specially”

  • pan/mdg: Handle bitsize for packs

  • pan/mdg: Print 8-bit constants

  • pan/mdg: Drop the u8 from the colorbuf op names

  • pan/mdg: Implement raw colourbuf loads on T720

  • panfrost: Add theory for new framebuffer lowering

  • panfrost: Determine unpacked type for formats

  • panfrost: Add quirks for blend shader types

  • panfrost: Determine load classes for formats

  • panfrost: Determine classes for stores

  • panfrost: Stub out lowering boilerplate

  • panfrost: Un/pack pure 32-bit

  • panfrost: Un/pack pure 16-bit

  • panfrost: Un/pack pure 8-bit

  • panfrost: Un/pack 8-bit UNORM

  • panfrost: Flesh out dispatch

  • panfrost: Un/pack UNORM 4

  • panfrost: Un/pack RGB565 and RGB5A1

  • panfrost: Un/pack RGB10_A2_UNORM

  • panfrost: Un/pack RGB10_A2_UINT

  • panfrost: Un/pack R11G11B10

  • panfrost: Un/pack sRGB via NIR

  • panfrost: Switch to pan_lower_framebuffer

  • panfrost: Conditionally allow fp16 blending

  • panfrost: Account for differing types in blend lower

  • panfrost: Let Gallium pack colours

  • panfrost: Check for large tilebuffer requirements

  • panfrost: Add separate_stencil BO to batch

  • panfrost: Use internal_format throughout

  • panfrost: Update fails list

  • pan/mdg: Handle 16-bit ld_vary

  • pan/mdg: Fuse f2f16 into load_interpolated_input

  • panfrost: Fix PRESENT flag mix-up

  • panfrost: Permit AFBC of RGB8

  • panfrost: Use VTX tag for vertex texturing

  • panfrost: Don’t flush explicitly when mipmapping

  • panfrost: Remove unused nir_lower_framebuffer pass

  • pan/mdg: Disassemble out-of-order bits

  • pan/mdg: Add quirk for missing out-of-order support

  • pan/mdg: Enable out-of-order execution after texture ops

  • nir: Fold f2f16(b2f32(x)) to b2f16(x)

  • pan/mdg: Don’t double-replicate blend on T720

  • pan/mdg: Distinguish blend shaders in internal shader-db

  • pan/mdg: Add roundmode enum

  • pan/mdg: Add opcode roundmode property

  • pan/mdg: Lower roundmodes

  • pan/mdg: Implement *_rtz conversions with roundmode

  • pan/mdg: Fold roundmode into applicable instructions

  • pan/mdg: Handle f2u8

  • pan/mdg: Allow f2u8 and friends thru

  • pan/mdg: Handle regular nir_intrinsic_load_output

  • panfrost: Passthrough NATIVE loads/stores

  • pan/bi: Handle SEL with vec3 16-bit

  • pan/bi: Fix SEL.16 swizzle

  • pan/bi: Pack second argument of F32_TO_F16

  • pan/bi: Passthrough second argument of F32_TO_F16

  • pan/bi: Handle vectorized load_const

  • panfrost: Update MALI_EARLY_Z description

  • panfrost: Document MALI_WRITES_GLOBAL bit

  • panfrost: Handle writes_memory correctly

  • panfrost: Readd MIDGARD_SHADERLESS quirk to t760

  • panfrost: Explicitly convert to 32-bit for logic-ops

  • pan/bi: Disassemble gl_PointCoord reads.

  • panfrost: Prefer sysval for gl_PointCoord on Bifrost

  • panfrost: Fix gl_PointSize out of GL_POINTS

  • panfrost: Mark point sprites as todo on Bifrost

  • pan/mdg: Legalize inverts with constants

  • pan/mdg: Ensure ld_vary_16 is aligned

  • panfrost: Ensure we have ro before using it

  • nir: Remove nir_intrinsic_output_u8_as_fp16_pan

  • pan/mdg: Avoid fusing ld_vary_16 with non-zero component

  • panfrost: Calculate varying size by format

  • panfrost: Add panfrost_streamout_offset helper

  • panfrost: Introduce bitfields for tracking varyings

  • panfrost: Determine varying buffer presence

  • panfrost: Emit unlinked varyings

  • panfrost: Emit special varyings

  • panfrost: Emit xfb records

  • panfrost: Add helper to determine if we are capturing

  • panfrost: Add high-level varying emit

  • panfrost: Use new varying linking

  • panfrost: Remove unused routines

  • panfrost: Allow R/RG/RGB varyings

  • panfrost: Only store varying formats

  • panfrost: Use shader_info harder

  • panfrost: Override varying format to minimal precision

  • panfrost: Demote mediump varyings to fp16

  • pan/mdg: Explicitly type 64-bit uniform moves

  • pan/mdg: Analyze types for 64-bitness in RA

  • pan/mdg: Prefer type over regmode for schedule constraints

  • pan/mdg: Precolour blend inputs

  • panfrost: Merge bifrost_bo/midgard_bo

  • panfrost: Update sampler view in Bifrost path

  • panfrost: Fix level_2

  • panfrost: Correctly calculate tiled stride

  • panfrost: Enable AFBC for RGB565

  • panfrost: Simplify AFBC format check

  • pan/mdg: Factor out unit check

  • pan/mdg: Allow scheduling “x + x” to multipliers

  • pan/mdg: Canonicalize (x * 2.0) to (x + x)

  • pan/mdg: Reassociate adds for multiply-by-two

  • nir: Propagate *2*16 conversions into vectors

  • panfrost: Specify stack_shift on SFBD

  • pan/mdg: Defer nir_fuse_io_16 until after opts

  • pan/mdg: Don’t assign destination in writeout block to r1

  • pan/mdg: Remove bundle interference code

  • pan/mdg: Schedule writeout to VLUT

  • pan/mdg: Defer smul, vlut until after writeout moves

  • pan/mdg: Allow Z/S writes to use any 2nd stage unit

  • pan/mdg: Prioritize non-moves on VADD/VLUT

  • pan/mdg: Skip r1.w write where possible

  • pan/mdg: Schedule based on liveness

  • pan/mdg: Respect type/mask in mir_lower_special_reads

  • pan/mdg: Fix indirect UBO swizzles

  • pan/decode: Fix MSAA texture decoding

  • pan/decode: Identify layered MSAA flag

  • pan/mdg: Allow ignoring move mode

  • pan/mdg: Handle GLSL_SAMPLER_DIM_MS

  • pan/mdg: Handle nir_tex_src_ms_index

  • pan/mdg: Handle nir_texop_txf_ms

  • pan/mdg: Use _VTX tag for texelFetch in frag shaders

  • panfrost: Set depth to sample_count for MSAA 2D

  • panfrost: Identify layer_stride

  • panfrost: Allocate space for multisampling

  • panfrost: Index texture by sample

  • panfrost: Include pointer for each sample

  • panfrost: Set layer_stride for multisampled rendering

  • panfrost: Don’t advertise MSAA 2x

  • panfrost: Identify coverage_mask

  • panfrost: Pass sample_mask to the hardware

  • panfrost: Implement alpha-to-coverage

  • panfrost: Identify depth/stencil layer strides

  • panfrost: Set depth/stencil_layer_stride accordingly

  • panfrost: Enable MSAA if we render to such a surface

  • panfrost: Save sample_mask before blitting

  • panfrost: Expose MSAA 4x

  • glsl: Handle 16-bit types in loop analysis

  • docs/features: Track Panfrost

  • panfrost: Introduce pan_pool struct

  • panfrost: Allocate pool BOs against the pool

  • panfrost: Track the device through the pool

  • panfrost: Expose pool-based allocation API

  • panfrost: Move debug flags into the device

  • panfrost: Drop Gallium-local pan_bo_create wrapper

  • panfrost: Move pool routines to common code

  • panfrost: Factor out scoreboarding state

  • panfrost: Pass polygon_list to tiler init function

  • panfrost: Drop batch from scoreboard routines

  • panfrost: Move scoreboarding routines to common

  • panfrost: Handle PIPE_FORMAT_X24S8_UINT

  • panfrost: Handle PIPE_FORMAT_S8_UINT

  • panfrost: Move panfrost_translate_texture_type

  • panfrost: Report blend shader work count

  • panfrost: Clamp pure int pixels

  • panfrost: Generate shader variants on framebuffer bind

  • panfrost: Always use SOFTWARE for pure formats

  • panfrost: Extend fetched framebuffer results

  • panfrost: Fix fence leak

  • panfrost: Fix write to free’d memory

  • panfrost: Add a sparse array to map GEM handles to BOs

  • panfrost: Index BOs from the BO map sparse array

  • panfrost: Merge PAN_BO_IMPORTED/PAN_BO_EXPORTED

  • panfrost: Remove PAN_BO_COHERENT_LOCAL

  • panfrost: Remove PAN_BO_DONT_REUSE

  • panfrost: Remove panfrost_bo_access type

  • panfrost: Compact unused BO flag bits

  • panfrost: Add format codes for new compressed textures

  • panfrost: Pipe in compressed texture feature mask

  • panfrost: Filter compressed texture formats

  • panfrost: Map PIPE_{DXT, RGTC, BPTC} to MALI_BCn

  • docs/features: Update ASTC entries for Panfrost

  • pan/mdg: Bump compiler RT maximum

  • pan/mdg: Identify per-sample interpolation mode

  • pan/mdg: Implement gl_SampleID

  • panfrost: Force Z/S writeback

  • panfrost: Expose panfrost_get_blend_shader

  • panfrost: Add MALI_PER_SAMPLE bit

  • panfrost: Include sample count in payload estimates

  • panfrost: Identify zs_samples field

  • panfrost: Add rectangle subtraction algorithm

  • panfrost: Handle per-sample shading

  • panfrost: Set zs_samples as necessary

  • panfrost: Track surfaces drawn per-batch

  • panfrost: Extract panfrost_batch_reserve_framebuffer

  • panfrost: Use Midgard-specific reloads

  • panfrost: Call util_blitter_save_fragment_constant_buffer_slot

  • panfrost: Overhaul tilebuffer allocations

  • panfrost: Set PIPE_CAP_MIXED_COLORBUFFER_FORMATS

  • panfrost: Fix sRGB clear colour packing

  • panfrost: Implement Z32F_S8 blits

  • panfrost: Abort on unsupported blit

  • panfrost: Avoid integer underflow in rt_count_1

  • panfrost: Honour cso->compare_mode

  • panfrost: Fix faults with RASTERIZER_DISCARD

  • panfrost: Report CAPs more honestly

  • panfrost: Enable Chromium

  • panfrost: Revert “Disable frame throttling”

  • docs/features: Mark trivial missed feature

  • panfrost: Enable FP16 by default

  • panfrost: Avoid wait=true flushing all batches

  • panfrost: Remove wait parameter to flush_all_batches

  • panfrost: Skip specifying in_syncs

  • panfrost: Allocate syncobjs in panfrost_flush

  • panfrost: Remove unused batch_fence->signaled

  • panfrost: Remove unused batch_fence->ctx

  • pan/bit: Update f32->f16 convert test

  • pan/bit: Remove BI_SHIFT stub

  • pan/mdg: Mask spills from texture write

  • pan/mdg: Test for SSA before chasing addresses

  • docs/features: Add GL_EXT_multisampled_render_to_texture

  • panfrost: Add MSAA mode selection field

  • panfrost: Implement EXT_multisampled_render_to_texture

  • panfrost: Set STRIDE_4BYTE_ALIGNED_ONLY

  • panfrost: Fix WRITES_GLOBAL bit

  • pan/mdg: Ensure barrier op is set on texture

  • panfrost: Fix blend leak for render targets 5-8

  • panfrost: Free cloned NIR shader

  • panfrost: Free NIR of blit shaders

  • panfrost: Free hash_to_temp map

  • pan/mdg: Free previous liveness

  • panfrost: Use memctx for sysvals

  • panfrost: Free batch->dependencies

  • pan/mdg: Fix discard encoding

  • pan/mdg: Fix perspective combination

  • pan/bit: Set d3d=true for CMP tests

Andreas Baierl (1):

  • nir/lower_int_to_float: Handle umax and umin

Andres Gomez (10):

  • .mailmap: add an alias for Iago Toral Quiroga

  • .mailmap: add an alias for Andres Gomez

  • gitlab-ci: update tracie README after changes in main script

  • scripts: remove unittest.mock dependency when not used

  • gitlab-ci: create always the “results” directory with tracie

  • gitlab-ci: correct tracie behavior with replay errors

  • gitlab-ci: build gfxreconstruct from the “dev” branch

  • gitlab-ci: get the last frame from a gfxr trace using gfxrecon-info

  • gitlab-ci/traces: updated paths and checksums for POLARIS10 traces

  • gitlab-ci: Test AMD’s Raven with traces

Andrey Vostrikov (1):

  • egl/x11: Free memory allocated for reply structures on error

Andrii Simiklit (3):

  • glsl_type: don’t serialize padding bytes from glsl_struct_field

  • i965/vec4: Ignore swizzle of VGRF for use by var_range_end()

  • glsl: fix crash on glsl macro redefinition

Ani (1):

  • drirc: Enable glthread for rpcs3

Anuj Phogat (6):

  • intel/devinfo: Add is_dg1 to device info

  • intel/l3: Add DG1 L3 configuration

  • intel/ehl: Use GEN11_URB_MIN_MAX_ENTRIES in device info

  • intel/ehl: Use macro GEN11_LP_FEATURES in device info

  • intel/ehl: Rename gen_device_info struct

  • intel/ehl: Add new PCI-IDs

Arcady Goldmints-Orlov (4):

  • anv: increase minUniformBufferOffsetAlignment to 64

  • intel/compiler: fix alignment assert in nir_emit_intrinsic

  • nir/spirv/glsl450: increase asin(x) precision

  • intel/compiler: Always apply sample mask on Vulkan.

Axel Davy (19):

  • st/nine: Set correctly blend max_rt

  • gallium/util: Fix leak in the live shader cache

  • ttn: Add new allow_disk_cache parameter

  • ttn: Implement disk cache

  • st/nine: Enable ttn cache

  • radeonsi: Enable tgsi to nir disk cache

  • st/nine: Add checks for pure device

  • st/nine: Return error when setting invalid depth buffer

  • st/nine: Do not return invalidcall on getrenderstate

  • st/nine: Pass more adapter formats for CheckDepthStencilMatch

  • st/nine: Improve return error code in CheckDeviceFormat

  • st/nine: Fix uninitialized variable in BEM()

  • st/nine: Fix a crash if the state is not initialized

  • st/nine: Add missing NULL checks

  • st/nine: Increase available GPU memory

  • st/nine: Retry allocations after freeing some space

  • st/nine: Improve pDestRect handling

  • st/nine: Ignore pDirtyRegion

  • st/nine: Handle full pSourceRect better

Bas Nieuwenhuizen (80):

  • radv: Fix implicit sync with recent allocation changes.

  • radv: Extend tiling flags to 64-bit.

  • radv: Provide a better error for permission issues with priorities.

  • radv: Support VK_PIPELINE_COMPILE_REQUIRED_EXT.

  • radv: Support VK_PIPELINE_CREATE_EARLY_RETURN_ON_FAILURE_BIT_EXT.

  • radv: Support VK_PIPELINE_CACHE_CREATE_EXTERNALLY_SYNCHRONIZED_BIT_EXT.

  • radv: Expose VK_EXT_pipeline_creation_cache_control.

  • radv/winsys: Finish mapping for sparse residency.

  • radv/winsys: Remove extra sizeof multiply.

  • radv: Handle failing to create .cache dir.

  • radv: Remove dead code.

  • radv: Do not close fd -1 when NULL-winsys creation fails.

  • radv: Implement vkGetSwapchainGrallocUsage2ANDROID.

  • frontend/dri: Implement mapping individual planes.

  • util/format: Add VK_FORMAT_D16_UNORM_S8_UINT.

  • util/format: Use correct pipe format for VK_FORMAT_G8_B8_R8_3PLANE_420_UNORM.

  • util/format: Add more multi-planar formats.

  • gallium/dri: Remove lowered_yuv tracking for plane mapping.

  • radeonsi: Explicitly map Z16_UNORM_S8_UINT to None for GFX10.

  • amd/common,radeonsi: Move gfx10_format_table to common.

  • radeonsi: Define gfx10_format in the common header.

  • radv: Include gfx10_format_table.h only from a single source file.

  • radv: Use common gfx10_format_table.h

  • radv: Use ac_surface to determine fmask enable.

  • radv: Pass no_metadata_planes info in to ac_surface.

  • radv: Enforce the contiguous memory for DCC layers in ac_surface.

  • radv: Rely on ac_surface for avoiding cmask for linear images.

  • radv: Use offsets in surface struct.

  • radv: Disable DCC in ac_surface.

  • radv: Disable HTILE in ac_surface.

  • radv: Allocate values/predicates at the end of the image.

  • amd/common: Add total alignment calculation.

  • radv: Use ac_surface to allocate aux surfaces.

  • vulkan/wsi/x11: Ensure we create at least minImageCount images.

  • radv/winsys: Deal with realloc failures in BO lists.

  • radv: Handle mmap failures.

  • radv/winsys: Distinguish device/host memory errors.

  • radv: Make radv_alloc_shader_memory static.

  • turnip: semaphore support.

  • meson: Do not require shader cache for radv.

  • amd/addrlib: fix another C++ one definition rule violation

  • radv: Set handle types in Android semaphore/fence import.

  • radv: Always enable PERFECT_ZPASS_COUNTS.

  • Revert “radv: add support for MRTs compaction to avoid holes”

  • radv: Use correct semaphore handle type for Android import.

  • amd/llvm: Mark pointer function arguments as 32-byte aligned.

  • amd/common: Cache intra-tile addresses for retile map.

  • amd/addrlib: Clean up unused colorFlags argument

  • amd/registers: add RLC_PERFMON_CLK_CNTL for pre-GFX10

  • radeonsi: Inhibit clock-gating for perf counters.

  • meson: Add missing git_sha1.h dependency.

  • amd: Add detection of timeline semaphore support.

  • radv/winsys: Add binary syncobj ABI changes for timeline semaphores.

  • radv: Add thread for timeline syncobj submission.

  • radv: Add winsys support for submitting timeline syncobj.

  • radv: Add winsys functions for timeline syncobj.

  • radv: Add timeline syncobj for timeline semaphores.

  • radv: Fix uninitialized variable in renderpass.

  • vulkan/wsi/x11: report device-group present rectangles with prime.

  • vulkan/wsi: Convert usage of -1 to UINT32_MAX.

  • radv: Fix host->host signalling with legacy timeline semaphores.

  • mesa/st: Actually free the driver part of memory objects on destruction.

  • radv: Don’t use both DCC and CMASK for single sample images.

  • radv: Fix assert that is too strict.

  • radv: Do not consider layouts fast-clearable on compute queue.

  • radv: When importing an image, redo the layout based on the metadata.

  • radv: Use getter instead of setter to extract value.

  • driconf: Support selection by Vulkan applicationName.

  • radv: Override the uniform buffer offset alignment for World War Z.

  • radv: Fix handling of attribs 16-31.

  • radv: Remove conformance warnings with ACO.

  • radv: Update CTS version.

  • radv: Fix 3d blits.

  • radv: Fix threading issue with submission refcounts.

  • radv: Avoid deadlock on bo_list.

  • spirv: Deal with glslang not setting NonUniform on constructors.

  • radeonsi: Work around Wasteland 2 bug.

  • spirv: Deal with glslang bug not setting the decoration for stores.

  • ac/surface: Fix depth import on GFX6-GFX8.

  • st/mesa: Deal with empty textures/buffers in semaphore wait/signal.

Ben Skeggs (38):

  • nir: use bitfield_insert instead of bfi in nir_lower_double_ops

  • nvir: bump max encoding size of instructions

  • nvir: introduce OP_LOP3_LUT

  • nvir: introduce OP_WARPSYNC

  • nvir: introduce OP_BREV with lowering to EXTBF_REV for current GPUs

  • nvir: introduce OP_SHF

  • nvir: introduce OP_BMSK

  • nvir: introduce OP_SGXT

  • nvir: introduce OP_FINAL

  • nvir: add constant folding for OP_PERMT

  • nvir: run replaceZero() before replaceCvt()

  • nvir/nir: fix fragment program output when using MRT

  • nvir/nir: move nir options to codegen

  • nvir/nir: flesh out options

  • nvir/nir: turn on lower_rotate

  • nvir/nir: implement nir_op_extract_u8

  • nvir/nir: implement nir_op_extract_i8

  • nvir/nir: implement nir_op_extract_u16

  • nvir/nir: implement nir_op_extract_i16

  • nvir/nir: implement nir_op_urol

  • nvir/nir: implement nir_op_uror

  • nvir/nir: nir expects the shift amount to wrap, rather than clamp

  • nvir/nir: use nir_lower_idiv

  • nvir/gm107: implement OP_PERMT

  • nvir/gm107: replace SHR+AND+AND with PRMT+PRMT in PFETCH lowering

  • nvir/gm107: separate out header for sched data calculator

  • nvir/nir/gm107: split nir shader compiler options from gf100

  • nvir/nir/gm107: turn on nir_lower_extract64

  • nvir/nir/gm107: switch off lower_extract_byte

  • nvir/nir/gm107: switch off lower_extract_word

  • nvir/gv100: initial support

  • nvir/gv100: enable support for tu1xx

  • nvc0: use NVIDIA headers for GK104->GM2xx compute QMD

  • nvc0: use NVIDIA headers for GP100- compute QMD

  • nvc0: move setting of entrypoint for a shader stage to a function

  • nvc0: remove hardcoded blitter vertprog

  • nvc0: initial support for gv100

  • nvc0: initial support for tu1xx

Benjamin Cheng (1):

  • drirc: Add picom to adaptive_sync exclusion list

Benjamin Tissoires (3):

  • CI: reduce bandwidth for git pull

  • gitlab-ci: update ci-fairy minio to latest upstream

  • gitlab-ci: do not run full CI on scheduled pipelines

Blaž Tomažič (1):

  • radeonsi: Fix omitted flush when moving suballocated texture

Boris Brezillon (14):

  • spirv: Split the vtn_emit_scoped_memory_barrier() logic

  • nir: Replace the scoped_memory barrier by a scoped_barrier

  • intel/compiler: Extract control barriers from scoped barriers

  • spirv: Use scoped barriers for SpvOpControlBarrier

  • nir: Add new rules to optimize NOOP pack/unpack pairs

  • nir: Use a switch in build_deref_offset()/deref_instr_get_const_offset()

  • nir: Allow casts in nir_deref_instr_get[_const]_offset()

  • freedreno: Initialize lower_int64_options to a proper value

  • nir: Stop passing an options arg to nir_lower_int64()

  • nir: Extend nir_lower_int64() to support i2f/f2i lowering

  • intel: Set int64_options to ~0 when lowering 64b ops

  • nir: Get rid of __[u]int64_to_fp32() and __fp32_to_[u]int64()

  • nir: Fix i64tof32 lowering

  • spirv: Add a vtn_get_mem_operands() helper

Boyuan Zhang (2):

  • radeon/vcn/enc: Re-write PPS encoding for HEVC

  • radeon/vcn: bump vcn3.0 encode major version to 1

Brian Ho (14):

  • turnip: Execute ir3_nir_lower_gs pass again

  • turnip: Fill out VkPhysicalDeviceSubgroupProperties

  • nir: Support sysval tess levels in SPIR-V to NIR

  • nir: Add an option for lowering TessLevelInner/Outer to vecs

  • turnip: Lower shaders for tessellation

  • turnip: Offset by component when lowering gl_TessLevel*

  • turnip: Parse tess state and support PATCH primtype

  • turnip: Allocate tess BOs as a function of draw size

  • turnip: Update VFD_CONTROL with tess system values

  • turnip: Emit HS/DS user consts as draw states

  • turnip: Support tess for draws

  • turnip: Force sysmem for tessellation

  • ir3: Unconditionally enable MERGEDREGS on a6xx

  • turnip: Enable tessellationShader physical device feature

Caio Marcelo de Oliveira Filho (32):

  • intel/dev: Bail when INTEL_DEVID_OVERRIDE is not valid

  • intel/fs: Clean up variable group size handling in backend

  • intel/fs: Add an option to lower variable group size in backend

  • intel/fs: Add and use a new load_simd_width_intel intrinsic

  • intel: Let drivers call brw_nir_lower_cs_intrinsics()

  • iris: Implement ARB_compute_variable_group_size

  • util/list: Add list_foreach_entry_from_safe

  • nir: Use deref intrinsics to set writes_memory when gathering info

  • intel/fs: Use writes_memory from shader_info

  • nir: Consider atomic counter intrinsics when setting writes_memory

  • intel/fs: Remove unused emission of load_simd_width_intel

  • intel/fs: Remove unused state from brw_nir_lower_cs_intrinsics

  • intel/fs: Early return when can’t satisfy explicit group size

  • intel/fs: Remove redundant assert()

  • intel/fs: Remove min_dispatch_width spilling decision from RA

  • intel/fs: Support INTEL_DEBUG=no8,no32 in compute shaders

  • intel/fs: Add helper to get prog_offset and simd_size

  • i965: Use new helper functions to pick SIMD variant for CS

  • iris: Set CS KernelStatePointer at dispatch

  • iris: Use new helper functions to pick SIMD variant for CS

  • anv: Use new helper functions to pick SIMD variant for CS

  • intel/fs: Generate multiple CS SIMD variants for variable group size

  • iris, i965: Drop max_variable_local_size

  • iris, i965: Update limits for ARB_compute_variable_group_size

  • intel: Add helper to calculate GPGPU_WALKER::RightExecutionMask

  • nir: Fix printing execution scope of a scoped barrier

  • spirv: Memory semantics is optional for OpControlBarrier

  • intel/fs: Add Fall-through comment

  • nir: Fix logic that ends combine barrier sequence

  • spirv: Handle most execution modes earlier

  • nir: Filter modes of scoped memory barrier in nir_opt_load_store_vectorize

  • spirv: Propagate explicit layout only in types that need it

Charmaine Lee (1):

  • llvmpipe: do not enable tessellation shader without llvm coroutines support

Chris Forbes (12):

  • bifrost: Set RTZ rounding mode for f2i conversion

  • bifrost: Lower x->bool conversions to != 0

  • bifrost: Emit “d3d” variant of comparison instructions

  • bifrost: Document d3d/gl comparison control bit

  • bifrost: Add lowering for b2i32

  • bifrost: Add support for nir_op_inot

  • bifrost: Add support for nir_op_ishl

  • bifrost: Add support for nir_op_uge

  • bifrost: Add support for nir_op_imul

  • bifrost: Add support for nir_op_iabs

  • bifrost: Honor src swizzle in special math ops

  • bifrost: Fix packing of ADD_FEXP2_FAST

Chris Wilson (6):

  • iris: Place a seqno at the end of every batch

  • iris: Convert fences to using lightweight seqno

  • iris: Store a seqno for each batch in the fence

  • iris: Initialise stub iris_seqno to 0

  • iris: Rename iris_seqno to iris_fine_fence

  • iris: Fixup copy’n’paste mistake in Makefile.sources

Christian Gmeiner (31):

  • etnaviv: fix SAMP_ANISOTROPY register value

  • etnaviv: do not use int filter when anisotropic filtering is used

  • ci: bare-metal: make it possible to use a script for serial

  • ci: extend expect-output.sh

  • ci: add U-Boot specific fetch strings

  • etnaviv: drop translate_blend(..)

  • ci: add arm_test-base docker image

  • ci: use separate docker images for baremetal builds

  • ci: fix possible spurious run of jobs

  • etnaviv: delete not used struct

  • etnaviv: convert enums

  • etnaviv: move etna_lower_io(..) to etnaviv_nir.c

  • etnaviv: get rid of etna_compile dependency

  • etnaviv: move etna_lower_alu(..) to etnaviv_nir.c

  • etnaviv: drop OPT_V define

  • etnaviv: make more use of compile_error(..)

  • etnaviv: move liveness related stuff into own file

  • etnaviv: merge struct etna_compile and etna_state

  • etnaviv: drop emit macro

  • etnaviv: move functions that generate asm to own file

  • etnaviv: move nir compiler related stuff into .c file

  • etnaviv: move ra into own file

  • etnaviv: replace prims-emitted query

  • ci: bare-metal: use nginx to get results from DUT

  • etnaviv: explicitly set nir_variable_mode

  • etnaviv: introduce struct etna_compiler

  • etnaviv: move shader_count to etna_compiler

  • etnaviv: do register setup only once

  • etnaviv: fix nir validation problem

  • etnaviv: call nir_lower_bool_to_bitsize

  • etnaviv: completely turn off MSAA

Christopher Egert (2):

  • radv: use util_float_to_half_rtz

  • r600: Use TRUNC_COORD on samplers

Clément Guérin (1):

  • radv: Always expose non-visible local memory type on dedicated GPUs

Con Kolivas (1):

  • Linux: Change minimum priority threads from SCHED_IDLE to nice 19 SCHED_BATCH.

Connor Abbott (88):

  • tu: Support pipelines without a fragment shader

  • tu: Add a “scratch bo” allocation mechanism

  • tu: Add noubwc debug flag to disable UBWC

  • tu: Implement fallback linear staging blit for CopyImage

  • freedreno/a6xx: Document dual-src blending enable bits

  • ir3: Fixup dual-source blending slot

  • tu: Move RENDER_COMPONENTS setting to pipeline state

  • tu: Implement dual-src blending

  • tu: Advertise COLOR_ATTACHMENT_BLEND_BIT for blendable formats

  • tu: Always initialize image_view fields for blit sources

  • tu: Fall back to 3d blit path for BC1_RGB_* formats

  • tu: Fix buffer compressed pitch calculation with unaligned sizes

  • tu: Support VK_FORMAT_FEATURE_BLIT_SRC_BIT for texture-only formats

  • tu: Fix IBO descriptor for cubes

  • tu: Respect VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT

  • tu: Add missing storage image/texel buffer bits

  • tu: Remove useless post-binning flushes

  • tu: Don’t actually track seqno’s for events

  • tu: Remove useless event_write helpers

  • tu: Rewrite flushing to use barriers

  • tu: Fix context faults loading unused descriptor sets

  • ir3: Pass reserved_user_consts to ir3_shader_from_nir()

  • tu: Remove num_samp hack

  • tu: Use the ir3 shader API

  • tu: Remove tu_shader_compile_options

  • tu: Set num_components to 0 when building bindless intrinsics

  • ir3: Don’t calculate num_samp ourselves

  • tu: Actually remove dead variables after io lowering

  • ir3: Split out variant-specific lowering and optimizations

  • ir3, freedreno: Round up constlen earlier

  • ir3: Include ir3_compiler from ir3_shader

  • ir3: Support variants with different constlen’s

  • ir3: Add ir3_trim_constlen()

  • tu: Share constlen between different stages properly

  • freedreno: Refactor ir3_cache shader compilation

  • freedreno: Share constlen between different stages properly

  • freedreno: On a5xx+ INDX_SIZE is MAX_INDICES

  • freedreno/registers: Label firstIndex field in CP_DRAW_INDX_OFFSET

  • tu: Pass firstIndex directly to CP_DRAW_INDX_OFFSET

  • freedreno/a6xx: use firstIndex field

  • nir: Refactor load/store intrinsic helper

  • nir: add vec2_index_32bit_offset address format

  • tu: Rewrite variable lowering

  • tu: Enable KHR_variable_pointers

  • ir3: Add layer_zero variant bit

  • tu: Force gl_Layer to 0 when necessary

  • freedreno/a6xx: Force gl_Layer to 0 when necessary

  • freedreno: Include adreno_pm4.xml.h before adreno_a6xx.xml.h

  • freedreno: Sync registers with envytools

  • freedreno/a6xx: Rename and document HLSQ_UPDATE_CNTL

  • freedreno/a6xx: Add some documentation for shared consts

  • tu: Don’t invalidate irrelevant state when changing pipeline

  • freedreno/a6xx: Add stencilref register info

  • ir3: Handle gl_FragStencilRefARB

  • tu: Enable VK_EXT_shader_stencil_export

  • freedreno: Add a helper for computing guardband sizes

  • tu: Use common guardband helper

  • freedreno: Use common guardband helper

  • freedreno/ir3: Fix SSBO size for bindless SSBO’s

  • tu: Enable VK_EXT_depth_clip_enable

  • freedreno: Clean up CP_DRAW_MULTI_INDIRECT definition

  • freedreno: Add INDIRECT_COUNT CP_DRAW_INDIRECT_MULTI variants

  • tu: Integrate WFI/WAIT_FOR_ME/WAIT_MEM_WRITES with cache tracking

  • tu: Add missing wfi to tu6_emit_hw()

  • tu: Implement VK_KHR_draw_indirect_count

  • tu: Fix empty blit scissor case

  • tu: Fix hangs for DS with no output

  • tu: Detect invalid-for-binning renderpass dependencies

  • tu: Enable vertex & fragment stores & atomics

  • tu: Fix descriptor update templates with input attachments

  • ir3: Validate bindless samp_tex correctly

  • ir3: Remove redundant samp_tex validation

  • ir3: Fix incorrect src flags for samp_tex

  • tu: Enable resource dynamic indexing

  • freedreno/rnn: Return success when parsing addvariant

  • tu: Dump CP_DRAW_INDIRECT_MULTI draw BO’s

  • freedreno/rnn: Support stripes in rnndec_decodereg

  • freedreno/cffdec: Handle CP_DRAW_INDIRECT_MULTI like other draws

  • freedreno: Add trace for CP_DRAW_INDIRECT_MULTI

  • freedreno/a6xx: Fix CP_BIN_SIZE_ADDRESS name

  • freedreno/rnn: Make rnn_decode_enum() respect variants

  • freedreno/cffdec: Stop open-coding enum parsing

  • freedreno/afuc: Add missing rnn_prepdb()

  • freedreno/afuc: Fix PM4 enum parsing

  • tu: Fix DST_INCOHERENT_FLUSH copy/paste error

  • freedreno: Document draw predication packets

  • tu: Reset has_tess after renderpass

  • tu: Implement VK_EXT_conditional_rendering

D Scott Phillips (4):

  • intel/fs: Update location of Render Target Array Index for gen12

  • anv,iris: Fix input vertex max for tcs on gen12

  • intel/dump_gpu: Fix name of LD_PRELOAD in env append logic

  • anv/gen11+: Disable object level preemption

Daniel Schürmann (54):

  • aco: either copy-propagate or inline create_vector operands

  • aco: coalesce parallelcopies during register allocation

  • nir: add nir_intrinsic_elect to divergence analysis

  • nir: refactor divergence analysis state

  • nir: rework phi handling in divergence analysis

  • nir: simplify phi handling in divergence analysis

  • nir: reset ssa-defs as non-divergent during divergence analysis instead of upfront

  • aco: fix WQM coalescing

  • aco: restrict copying of create_vector operands to GFX9+

  • aco: don’t move create_vector subdword operands to unsupported register offsets

  • aco: fix corner case in register allocation

  • aco: don’t allow unaligned subdword accesses on GFX6/7

  • aco: fix register assignment for p_create_vector on GFX6/7

  • aco: simplify statistics collection for copies

  • aco: use full-register instructions to implement subdword packing on GFX6/7

  • aco: Workarounds subdword lowering on GFX6/7

  • aco: adjust GFX6 subdword lowering workarounds for 8bit

  • aco: add and use scratch SGPR to lower subdword p_create_vector on GFX6/7

  • aco: coalesce copies more aggressively when lowering to hw

  • aco: skip partial copies on first iteration when lowering to hw

  • aco: optimize packing of 16bit subdword registers on GFX6/7

  • aco: remove unnecessary split- and create_vector instructions for subdword loads

  • aco: fix shared subdword loads

  • aco: reorder calls to aco_validate() and cleanup aco_compile_shader()

  • aco: don’t allow SGPRs on logical phis

  • aco: fix WQM handling in nested loops

  • radv/aco: implement logic64 instead of lowering

  • aco: align swap operations to 4 bytes on GFX6/7

  • aco: don’t allow partial copies on GFX6/7

  • radv: introduce RADV_DEBUG=llvm option

  • radv: change use_aco -> use_llvm

  • radv: enable ACO by default

  • aco: fix partial copies on GFX6/7

  • aco: remove superfluous (bool & exec) if the result comes from VOPC

  • nir: also move vecN in case of nir_move_copies

  • nir: refactor nir_can_move_instr

  • nir/algebraic: optimize bcsel(a, 0, 1) to b2i

  • nir: also move b2i in case of nir_move_copies

  • nir/algebraic: optimize iand/ior of (n)eq zero

  • nir/algebraic: add optimizations for fsign/isign

  • nir/algebraic: add some more unop + bcsel optimizations

  • nir/algebraic: optimize fmul(x, bcsel(c, -1.0, 1.0)) -> bcsel(c, -x, x)

  • nir/algebraic: optimize (a < 0.0) ? -a : a -> fabs(a)

  • nir/algebraic: add distributive rules for ior/iand

  • nir/algebraic: propagate b2i out of ior/iand

  • nir/algebraic: fold some nested bcsel

  • aco: fix scratch loads which cross element_size boundaries

  • aco: ensure to not extract more components than have been fetched

  • aco: don’t split store data if it was already split into more elements

  • aco: prevent infinite recursion in RA for subdword variables

  • aco: ensure readfirstlane subdword operands are always dword aligned

  • radv: call radv_nir_lower_ycbcr_textures after first optimizations

  • aco: add GFX6/7 subdword lowering tests

  • aco: execute branch instructions in WQM if necessary

Daniel Stone (13):

  • CI: Disable Panfrost T7x0 jobs

  • CI: Re-enable Panfrost T7x0 jobs

  • llvmpipe: Expect increased exp precision on Windows

  • CI: Windows: Build LLVM and llvmpipe

  • CI: Disable Panfrost T720/T760

  • Revert “CI: Disable Panfrost T720/T760”

  • CI: Enable assertions on Windows

  • CI: Try shared libraries on Windows

  • CI: Correct build-directory path on Windows, and keep it

  • CI: Re-enable the Windows VS2019 build job

  • CI: Temporarily disable Panfrost T860 jobs

  • CI: Re-enable Panfrost T860 jobs

  • CI: Disable Windows build due to unstable infrastructure

Danylo Piliaiev (25):

  • glsl: rename has_implicit_uint_to_int_conversion to _int_to_uint_

  • i965: Fix out-of-bounds access to brw_stage_state::surf_offset

  • anv: Translate relative timeout to absolute when calling anv_timelines_wait

  • anv: Fix deadlock in anv_timelines_wait

  • meson: Disable GCC’s dead store elimination for memory zeroing custom new

  • mesa: Fix double-lock of Shared->FrameBuffers and usage of wrong mutex

  • st/mesa: Clear texture’s views when texture is removed from Shared->TexObjects

  • intel/fs: Work around dual-source blending hangs in combination with SIMD16

  • glsl: Don’t replace lrp pattern with lrp if arguments are not floats

  • glsl: inline functions with unsupported return type before converting to nir

  • i965: Work around incorrect usage of glDrawRangeElements in UE4

  • st/mesa: account for “loose”, per-mipmap level textures in CopyImageSubData

  • iris: Honor scanout requirement from DRI

  • iris: Fix fast-clearing of depth via glClearTex(Sub)Image

  • nir/opt_if: Fix opt_if_simplification when else branch has jump

  • nir/tests: Add tests for opt_if_simplification

  • st/mesa: Treat vertex outputs absent in outputMapping as zero in mesa_to_tgsi

  • anv/nir: Unify inputs_read/outputs_written between geometry stages

  • spirv: Only require bare types to match when copying variables

  • glsl: Eliminate out-of-bounds triop_vector_insert

  • intel/compiler: Fix pointer arithmetic when reading shader assembly

  • glsl: Eliminate assignments to out-of-bounds elements of vector

  • nir/lower_io: Eliminate oob writes and return zero for oob reads

  • nir/large_constants: Eliminate out-of-bounds writes to large constants

  • nir/lower_samplers: Clamp out-of-bounds access to array of samplers

Daryl W. Grunau (1):

  • prevent multiply defined symbols

Dave Airlie (199):

  • i965: add support for gen 5 pipelined pointers to dump

  • i965: disable shadow batches when batch debugging.

  • draw/tess: free tessellation control shader i/o memory.

  • llvmpipe/nir: free compute shader NIR

  • llvmpipe: simple texture barrier implementation.

  • gallivm/sample: add multisample support for texel fetch

  • gallivm/sample: add multisample image operation support

  • gallivm/nir/tgsi: add multisample texture sampling.

  • gallivm/nir: add multisample support to image size

  • gallivm/nir: add multisample image operations

  • draw: introduce sampler num samples + stride members

  • draw: add support for num_samples + sample_stride to the image paths

  • llvmpipe: add num_samples/sample_stride support to jit textures

  • llvmpipe: add samples support to image jit

  • util: add a resource wrapper to get resource samples

  • llvmpipe: add multisample support to texture allocator.

  • llvmpipe: add a max samples define set to 4.

  • gallium/util: split out zstencil clearing code.

  • llvmpipe: fix race between draw and setting fragment shader.

  • llvmpipe: add get_sample_position support (v2)

  • llvmpipe/jit: pass fragment sample mask via jit context.

  • llvmpipe: pass incoming sample_mask into fragment shader context.

  • llvmpipe: add internal multisample texture mapping path.

  • llvmpipe: add multisample resource copy region support.

  • llvmpipe: add clear texture support for multisample textures.

  • llvmpipe: handle multisample render target clears

  • draw: disable point/line smoothing for multisample (v2)

  • llvmpipe: pass color and depth sample strides into fragment shader.

  • llvmpipe: record sample info for color/depth buffers in scene

  • llvmpipe/rast: fix tile clearing for multisample color and depth tiles

  • llvmpipe: plumb multisample state bit into setup code.

  • llvmpipe: add multisample bit to fragment shader key.

  • llvmpipe: change mask input to fragment shader to 64-bit.

  • llvmpipe: add cbuf/zsbuf + coverage samples to the fragment shader key.

  • gallivm: add sample id/pos intrinsic support

  • gallivm: add mask api to force mask

  • nir/tgsi: translate the interp location

  • llvmpipe: pass interp location into interpolation code.

  • llvmpipe: add centroid interpolation support.

  • llvmpipe: add per-sample interpolation.

  • llvmpipe: move getting mask value out of depth code. (v2)

  • llvmpipe: add per-sample depth/stencil test

  • llvmpipe: move some fs code around

  • llvmpipe: multisample sample mask + early/late depth pass

  • llvmpipe: handle multisample early depth test/late depth write

  • llvmpipe: interpolate Z at sample points for early depth test.

  • llvmpipe: handle multisample color stores.

  • llvmpipe: hook up sample position system value

  • llvmpipe: add multisample alpha to coverage support.

  • llvmpipe: add multisample alpha to one support

  • llvmpipe: handle gl_SampleMask writing.

  • llvmpipe: don’t allow branch to end for early Z with multisample

  • llvmpipe: pass mask store into interp for centroid interpolation

  • llvmpipe: move color storing earlier in frag shader

  • llvmpipe: fix multisample occlusion queries.

  • llvmpipe: disable opaque variant for multisample

  • llvmpipe: add new rast api to pass full 64-bit mask.

  • llvmpipe: add fixed point sample positions to scene.

  • llvmpipe: build 64-bit coverage mask in rasterizer

  • llvmpipe: fixup multisample coverage masks for covered tiles

  • llvmpipe: generate multisample triangle rasterizer functions (v2)

  • llvmpipe: choose multisample rasterizer functions per triangle (v2)

  • llvmpipe: choose correct position for multisample

  • llvmpipe: don’t choose pixel centers for multisample

  • drisw: add multisample support to sw dri layer.

  • llvmpipe: enable 4x sample MSAA + texture multisample

  • gallivm/sample: add num samples query for txqs (v2)

  • gallivm/nir: hooks up texture samples queries

  • llvmpipe: enable GL_ARB_shader_texture_image_samples

  • llvmpipe: add min samples support to the fragment shader.

  • llvmpipe: enable ARB_sample_shading

  • llvmpipe: make sample position a global array.

  • zink: enable conditional rendering if available

  • r600: enable TEXCOORD semantic for TGSI.

  • r600/sfn: plumb the chip class into the instruction emission

  • r600/sfn: fix cayman float instruction emission.

  • r600/sfn: cayman fix int trans op2

  • r600/sfn: add callstack non-evergreen support

  • r600/sfn: add emit if start cayman support

  • llvmpipe: don’t use sample mask with 0 samples

  • llvmpipe: use per-sample position not sample id for interp

  • llvmpipe/interp: fix interpolating frag pos for sample shading

  • llvmpipe: remove non-simple interpolation paths.

  • gallivm/nir: add an interpolation interface.

  • llvmpipe/interp: refactor out use of pixel center offset

  • llvmpipe/interp: refactor out centroid calculations

  • llvmpipe: add interp instruction support

  • llvmpipe/fs: hook up the interpolation APIs.

  • gallivm/nir: add sample_mask_in support

  • llvmpipe: add gl_SampleMaskIn support.

  • r600/sfn: fix nop channel assignment.

  • llvmpipe: compute shaders work better with all the threads.

  • llvmpipe: move coroutines out of noopt case

  • ci: bump virglrenderer to latest version

  • util/disk_cache: add fallback for disk_cache_get_function_identifier

  • llvmpipe/cs: overhaul cs variant key state.

  • llvmpipe/draw: drop variant number from function names.

  • gallivm: rework coroutine malloc/free callouts.

  • gallivm: rework debug printf hook to use global mapping.

  • gallivm: add support for a cache object

  • gallivm: skip operations if we have a cached object.

  • gallivm: add cache interface to mcjit

  • llvmpipe: add infrastructure for disk cache support

  • gallivm: don’t cache shaders that use fetch functions.

  • llvmpipe/fs: add caching support

  • llvmpipe/cs: add shader caching

  • draw: add disk cache callbacks for draw shaders

  • llvmpipe: hook draw disk cache up

  • draw: add disk caching for draw shaders

  • draw/gs: fix emitting inactive primitives crash

  • draw/gs: add more info to debugging.

  • gallivm/nir: add group barrier support

  • llvmpipe: fix subpixel bits reporting.

  • gallivm/format: convert unsigned values to float properly.

  • gallivm/conv: enable conversion min code. (v2)

  • gallivm/sample: fix texel type for stencil 8-bit

  • llvmpipe/setup: add planes for draw regions if no scissor.

  • gallivm/cache: don’t require a null terminator for cache data.

  • mesa/gles3: add support for GL_EXT_shader_group_vote

  • virgl: change vendor id to reflect reality more.

  • llvmpipe: change vendor to be more generic.

  • softpipe: change vendor name to something more generic.

  • gallivm/nir: fix const loading on big endian systems

  • glsl: fix constant packing for 64-bit big endian.

  • gallivm/nir: fix big-endian 64-bit splitting/merging.

  • llvmpipe: fix occlusion queries on big-endian.

  • mesa/get: fix enum16 big-endian getting.

  • draw/llvm: fix big-endian mask adjusting

  • draw: pass nr_samplers into llvm sample state creation.

  • llvmpipe: pass number of samplers into llvm sampler code.

  • gallivm/sample: change texture function generator api

  • gallivm: add indirect texture switch statement builder.

  • draw: add support for indirect texture access

  • llvmpipe: add support for indirect texture access.

  • gallivm/nir: add texture unit indexing

  • gallivm/nir: handle non-uniform texture offsets

  • gallivm/sample: pass indirect offset into texture/image units

  • llvmpipe/draw: wire up indirect offset

  • gallivm/sample: handle size unit offset

  • llvmpipe: enable ARB_gpu_shader5

  • draw: pass number of images to image soa create

  • llvmpipe: pass number of images into image soa create

  • gallivm/nir: support passing image index into image code.

  • gallivm/nir: refactor image operations for indirect support.

  • gallivm/img: refactor out the texel return type (v2)

  • gallivm/nir: add support for indirect image loading

  • draw/sample: add support for indirect images

  • llvmpipe: handle indirect images properly

  • ci: fixup tests after all indirect images fixes.

  • docs: update llvmpipe GL 4.0 status

  • draw/clip: cleanup viewport index handling code.

  • draw/clip: fix viewport index for geometry shaders

  • mesa/version: only enable GL4.1 with correct limits.

  • llvmpipe: bump texture/scene limits to enable GL 4.1

  • llvmpipe: bump to GL support to GL 4.1

  • llvmpipe: enable GL 4.2

  • gallivm/nir: call end prim at end on all GS streams.

  • draw: emit so primitives before ending empty pipeline.

  • draw/gs: fix up current verts in output fetching.

  • gallivm/draw/gs: pass vertex stream count into shader build

  • draw/gs: only allocate memory for streams needed.

  • gallivm/gs_iface: pass stream into end primitive interface.

  • gallivm/nir: don’t access stream var outside bounds

  • gallivm/nir: end primitive for all streams.

  • draw: account primitive lengths for all streams.

  • draw/gs: reverse the polarity of the invocation/prims execution

  • draw: use common exit path in pipeline finish.

  • draw: free vertex info from geometry streams.

  • draw/gs: use mask to limit vertex emission.

  • ci/virgl: update results after streams fixes.

  • llvmpipe: add ARB_post_depth_coverage support.

  • llvmpipe: denote NEW fs when images change.

  • llvmpipe: flush resources on sampler view binding

  • llvmpipe/cs: fix image/sampler binding for compute

  • nouveau: avoid LTO ODR warning (v2)

  • gallivm/sample: always square rho before fast log2

  • llvmpipe/format: fix snorm conversion

  • mesa: change dsa texture error codes for GL 4.6

  • ci: bump piglit checkout for dsa tests

  • llvmpipe: fix stencil only formats.

  • llvmpipe: fix position offset interpolation

  • llvmpipe/cs: respect render condition

  • llvmpipe: add framebuffer fetching support (v1.1)

  • ci/llvmpipe: reenable gpu shader5 tests

  • llvmpipe: enable EXT_texture_shadow_lod

  • llvmpipe/draw: handle constant buffer limits and robustness (v1.1)

  • drisw: add robustness extension support.

  • glx/drisw: add robustness support

  • llvmpipe: add device reset query context hook.

  • llvmpipe: enable robust buffer access + GL 4.3, GLES 3.2 and robust buffer access behaviour

  • llvmpipe/ms: fix sign extension bug in rasterizer.

  • Revert “llvmpipe: Use the default behavior of ALLOW_MAPPED_BUFFERS.”

  • radv: cleanup locking around timeline waiting.

  • llvmpipe: only read 0 for channels being read

  • llvmpipe/blit: for 32-bit unorm depth blits just copy 32-bit

  • llvmpipe: enable GL 4.5

  • llvmpipe/cs: update compute counters not fragment shader.

  • llvmpipe: include gallivm perf flags in shader cache.

  • gallivm: disable brilinear for lod bias and explicit lod.

David McFarland (1):

  • radv: link with ld_args_build_id

David Stevens (2):

  • nir: Add colorspace support to YUV lowering pass

  • i965/i915: Add colorspace support to YUV sampling

Denys (1):

  • gitlab: Ask about reproduction rate in the issue template

Dmitriy Nester (8):

  • mesa: check draw buffer completeness on glClearBufferfv/glClearBufferuiv

  • nir: replace fnv1a hash function with xxhash

  • freedreno: replace fnv1a hash function with xxhash

  • i965: replace fnv1a hash function with xxhash

  • util/hash_table: replace fnv1a hash function with xxhash

  • r600: replace fnv1a hash function with xxhash

  • zink: replace fnv1a hash function with xxhash

  • util: delete fnv1a hash function

Duncan Hopkins (1):

  • zink. Changed sampler default name.

Dylan Baker (41):

  • docs: Add release notes for 20.0.6

  • docs: Add SHA256 sums for 20.0.6

  • docs: update calendar, add news item, and link releases notes for 20.0.6

  • docs: Add release notes for 20.0.7

  • docs/relnotes Add sha256 sums to 20.0.7

  • docs: update calendar, add news item, and link releases notes for 20.0.7

  • tests: Make tests aware of meson test wrapper

  • meson: Bump required version to 0.52.0

  • meson: Use the check_header function

  • meson: Use build_always_stale instead of build_always

  • meson: Use builtins for checking gnu __attributes__

  • drm-shim/meson: The name of the target is a string not a list

  • drm-shim/meson: Use portable override_options for setting C standard

  • meson: use gnu_symbol_visibility argument

  • meson: use 2 space not 3 space indent

  • meson: deprecated ‘true’ and ‘false’ in combo options for ‘enabled’ and ‘disabled’

  • vulkan-overlay/meson: use install_data instead of configure_file

  • docs: Add release notes for 20.0.8

  • docs: Add sha256sums for 20.0.8

  • docs: update calendar, add news item, and link releases notes for 20.0.8

  • mesa/swrast: use logf2 instead of util_fast_log2

  • VERSION: bump for 20.2.0-rc1

  • .pick_status.json: Update to 9333a8570d2174b73da63c3ee6f1a740ae487ab8

  • .pick_status.json: Update to 1e28745bc0d3528c1dfc25459456849feb58d407

  • meson/freedreno: Fix lua requirement

  • .pick_status.json: Update to fdb97d3d2914c8f887a7968432db4fdbd35d8376

  • bump version for 20.2.0-rc2

  • .pick_status.json: Update to 61042b1bdb199f98dd34085ed29a8c492ed9b2a3

  • .pick_status.json: Update to 6d28270968e0728bf8bdf48a6abd261c50d9ef07

  • .pick_status.json: Update to ca7d66e847d08914cec0a5e003b400da9c0a2695

  • VERSION: bump for 20.2.0-rc3

  • .pick_status.json: Update to 7fbded8b5821a47c26245b181446f972f920a96e

  • .pick_status.json: Mark e93979ba599355c42df01a89073362b970489a3a as denominated

  • .pick_status.json: Update to b9927c8c8d0c105699306a68773c015930ff9509

  • VERSION: bump for 20.2.0-rc4

  • .pick_status.json: Update to ef980ac0c1cd65993ba0c1d20e1c09b45bfef99d

  • fix: gallivm: disable brilinear for lod bias and explicit lod.

  • .pick_status.json: Update to a1f46d7b6943699e5efb60fbcfdd1450db85adb1

  • amd/ac_surface: convert tabs to 3 spaces

  • .pick_status.json: Update to 90b98c06493f8a9759e5496d5ec91fb60edf7b92

  • .pick_status.json: Update to 472a20c5fc0feda0f074b4ff95fd7c7a6305c8cd

Eduardo Lima Mitev (2):

  • freedreno: Centralize UUID generation into new files freedreno_uuid.c/h

  • freedreno/uuid: Generate meaningful device and driver UUID

Elie Tournier (12):

  • virgl: implement ARB_clear_texture

  • virgl: Enable CAP_CLEAR_TEXTURE if host supports it

  • docs/features: Add ARB_clear_texture to virgl

  • gallium: add TGSI_PROPERTY_FS_BLEND_EQUATION_ADVANCED

  • glsl_to_tgsi: Set TGSI_PROPERTY_FS_BLEND_EQUATION_ADVANCED

  • virgl: Reserved last caps of capability_bits

  • gallium: Add PIPE_CAP_BLEND_EQUATION_ADVANCED

  • st: expose KHR_blend_equation_advanced if PIPE_CAP_BLEND_EQUATION_ADVANCED

  • glsl_to_ir: do lower_blend_equation if PIPE_CAP_FBFETCH

  • virgl: Use alpha_src_factor to store blend_equation_advanced value

  • virgl: Encode barrier for blend_equation_advanced

  • virgl: set PIPE_CAP_BLEND_EQUATION_ADVANCED

Emmanuel (3):

  • meson: Do not enable USE_ELF_TLS for FreeBSD

  • iris: Explicitly cast value to uint64_t

  • i965: Explicitly cast value to uint64_t

Emmanuel Gil Peyrot (2):

  • util/rand_xor: use getrandom() when available

  • Expose EGL_KHR_platform_* when EXT is supported

Emmanuel Vadot (1):

  • meson: Add versioning for xvmc tracker

Eric Anholt (228):

  • freedreno/ir3: Initialize the unused dwords of the immediates consts.

  • freedreno/ir3: Drop redundant IR3_REG_HALF setup in ALU ops.

  • freedreno/ir3: Leave bools as 1-bit, storing them in full regs.

  • freedreno/ir3: Set up the block predecessors for a3xx TF

  • freedreno/ir3: Fix the a3xx TF outputs stores.

  • freedreno/ir3: Fix register allocation assertion failures.

  • freedreno: Stop doing binning shaders other than the VS in shader-db.

  • freedreno/ir3: Skip tess epilogue if the program is missing stores.

  • freedreno: Fix assertion failures on GS/tess shaders with shader-db enabled.

  • freedreno/ir3: Remove unused half precision shader key flag.

  • freedreno: Emit debug messages when doing draw-time recompiles of shaders.

  • freedreno/ir3: Improve shader key normalization.

  • freedreno/ir3: Stop initializing regid of so->outputs during setup.

  • freedreno/ir3: Set up outputs for multi-slot varyings.

  • freedreno: Immediately compile a default variant of shaders.

  • freedreno/ir3: Set the FS .msaa flag to true during precompiles.

  • freedreno/ir3: Add some more tests of cat6 disasm.

  • freedreno/ir3: Sync some new changes from envytools.

  • freedreno/ir3: Define the bindful uniform/nonuniform desc modes for cat6 a6xx.

  • freedreno/ir3: Disable sin/cos range reduction for mediump.

  • ci: Clean up setup of the job-specific env vars in baremetal testing.

  • ci: Enable IRC flake reporting on freedreno baremetal boards.

  • ci: Improve the flakes reports on IRC.

  • ci: Fix the nick used in IRC reporting.

  • freedreno: Deduplicate ringbuffer macros with computerator/fdperf

  • freedreno: Clean up tests around ORing in the reloc flags.

  • freedreno: Rename append_bo() in case it doesn’t get inlined.

  • freedreno: Initialize the bo’s iova at creation time.

  • freedreno: Start moving relocs flags into the BOs.

  • freedreno: Replace OUT_RELOCD with permanently flagging shader BOs for it.

  • freedreno: Mark all ringbuffer BOs as to be dumped on crash.

  • freedreno: Tell the kernel that all BOs are for writing.

  • freedreno: Replace OUT_RELOCW with OUT_RELOC.

  • freedreno: Drop the “write” arg to emit_const_bo now relocs don’t care.

  • nir: Fix count when we didn’t lower load_uniforms but did shift load_ubos.

  • freedreno: Fix non-constbuf-upload UBO block indices and count.

  • freedreno: Add a nohw flag to skip submitting to the kernel.

  • freedreno: Split the fd_batch_resource_used by read vs write.

  • freedreno: Add an early out for preparing to read a resource.

  • freedreno: Move the resource_read early out to an inline.

  • freedreno: Skip taking the lock for resource usage if it’s already flagged.

  • freedreno/a4xx+: Increase max texture size to 16384.

  • freedreno/a6xx: Improve layout testcase logging for UBWC fails.

  • freedreno/a6xx: Add a testcase for UBWC buffer sharing.

  • freedreno: Pull the tile_alignment lookup for a layout to a helper.

  • freedreno/a6xx: Fix UBWC blockheight for RG8.

  • freedreno/a6xx: Fix UBWC mipmap sizing.

  • freedreno/a6xx: Fix UBWC mipmapping height alignment.

  • nir: Include num_ubos in the printed shader (if nonzero).

  • freedreno/ir3: Clean up a silly nir_src_for_ssa(src.ssa).

  • freedreno/ir3: Leave the cursor alone during ir3_nir_try_propagate_bit_shift.

  • freedreno/ir3: Move i/o offset lowering after analyze_ubo_ranges.

  • freedreno: Trim num_ubos to just the ones we haven’t lowered to constbuf.

  • freedreno/a6xx: Use LDC for UBO loads.

  • freedreno: Drop the noubo fails list for CI, since there aren’t any now.

  • freedreno: Fix attempts to push UBO contents past the constlen on pre-a6xx.

  • freedreno: Fix resource layout dump loop.

  • freedreno: Avoid duplicate BO relocs in FD_RINGBUFFER_OBJECTs.

  • ci: Move cross file generation to a shared script.

  • ci: Autodetect whether we need cross setup in lava_arm builds.

  • ci: Make cmake toolchain file for deqp cross build setup.

  • ci: Make the create-rootfs more resilient.

  • ci: Update versions of packages to remove from rootfses.

  • ci: Switch the baremetal runner to be an x86 docker image.

  • ci: Disable SMP on the a5xx boards.

  • ci: Make a530’s GLES3/31 fractional runs much more complete.

  • freedreno/a5xx: Move resource layout to fdl.

  • freedreno/fdl: Separate the list of a6xx testcases from the test code.

  • freedreno/a5xx: Add the outline of a unit test for a5xx layout.

  • freedreno/a5xx: Set MIN_LAYERSZ on 3D textures like we do on a6xx.

  • freedreno/a5xx: Define the 2D blit UBWC pitch fields

  • ci: Fix DEQP_CASELIST_FILTER (used by a630 noubo run)

  • ci: Do an explicit NIR validation-enabled pass on freedreno a630.

  • ci: Don’t forget to set NIR_VALIDATE in baremetal runs.

  • ci: Enable a fractional run with UBO-to-constbuf disabled on a3xx.

  • ci: Improve baremetal’s logging of the job env var passthrough.

  • freedreno/a6xx: Fix the size of buffer image views.

  • freedreno: Fix printing of unused src in disasm of cat6 RESINFO.

  • freedreno: Add more resinfo/ldgb testcases.

  • freedreno: Fix resinfo asm, which doesn’t have srcs besides IBO number.

  • freedreno: Set the immediate flag in a4/a5xx resinfos.

  • freedreno/ir3: Refactor out IBO source references.

  • freedreno/ir3: Move handle_bindless_cat6 to compiler_nir and reuse.

  • freedreno/ir3: Use RESINFO for a6xx image size queries.

  • ci: Drop double “.txt” suffix on the unexpected results file.

  • ci: Drop old comment about enabling --deqp-watchdog.

  • ci: Auto-detect the architecture for VK ICD filenames.

  • ci: Add DEQP_EXPECTED_RENDERER support for VK tests.

  • ci: Move baremetal DEQP_NO_SAVE_RESULTS setup to the yml.

  • ci: Quick exit qpa extraction for non-matching qpas.

  • ci: Disable the firmware loader user helper option in arm64 kernels.

  • ci: Build a cheza kernel.

  • ci: Add scripts for controlling bare-metal chezas.

  • ci: Switch cheza (freedreno a630) testing to baremetal.

  • ci: Don’t build an arm_test container now that the last user is gone.

  • ci: Rename x86_cross_arm_test to just arm_test.

  • turnip: Move vertex buffer bindings to SET_DRAW_STATE.

  • turnip: Don’t bother clamping VB size.

  • turnip: Simplify vertex buffer bindings.

  • turnip: Use tu_cs_emit_regs() for BLEND_CONTROL.

  • turnip: Add support for alphaToOne.

  • freedreno/a6xx: Add support for ALPHA_TO_ONE.

  • freedreno: Upload gallium constbufs as needed when referenced as a UBO.

  • freedreno/ir3: Refactor ir3_cp’s lower_immed().

  • freedreno/ir3: Stop pushing immediates once we’ve filled the constbuf.

  • freedreno/ir3: Drop unnecessary alignment of pushed UBO size.

  • freedreno/ir3: Stop shifting UBO 1 down to be UBO 0.

  • freedreno/ir3: Account for driver params in UBO max const upload.

  • freedreno/ir3: Drop the max_const on a6xx to 512.

  • freedreno/ir3: Handle cases where we decide not to lower UBO 0 loads.

  • turnip: Fix crashes in compute with no descriptors to load.

  • ci: Bump up to the current version of the VK CTS.

  • ci: Disable shader cache on vulkan CI runs.

  • ci: Build the full VK CTS for baremetal testing.

  • ci: Enable pre-merge fractional vulkan CTS runs on the turnip driver.

  • ci: Use rsync for initial nfsroot population on cheza.

  • turnip: Expose robustBufferAccess.

  • freedreno/a6xx: Fix clip_halfz support.

  • ci: Leave a note as to what might be going on with a test.

  • ci: Fix weird filesystem globs appearing in failed test .qpa files.

  • ci: Disable some flaky tests on turnip.

  • ci/bare-metal: Reword the final output of the init script on the board.

  • ci/bare-metal: Make which test to run configurable.

  • ci/bare-metal: Use the deqp-runner bits straight out of the artifacts.

  • ci/bare-metal: Stop fetching the git tree.

  • ci/bare-metal: Terminate the job with an error on kernel panic.

  • docs: Replace ancient swrast conformance docs with more current information.

  • docs: Add dri-devel to the mailing lists and drop the DRI wiki link.

  • ci: disable the windows tests until the runner can be stabilized again

  • ci: Bump vulkan CTS to 1.2.3.0.

  • ci: Enable NIR validation on a630 GLES2 and VK tests.

  • ci/bare-metal: Skip setting of unset variables at startup.

  • ci/bare-metal: Don’t include dev packages in arm*test.

  • ci/tracie: Print the path if the trace isn’t found.

  • ci/tracie: Fix apitrace dump using “less” which isn’t in the ARM rootfs.

  • ci: Add a freedreno a630 tracie run.

  • freedreno/a6xx: Define the register fields for polygon fill mode.

  • turnip: Add support for polygon fill modes.

  • freedreno/a6xx: Add support for polygon fill mode (as long as front==back).

  • ci: Remove a stray “always” on the freedreno traces job.

  • ci/bare-metal: Fail early when we get stuck powering on a cheza.

  • ci/baremetal: Bump the kernel to a recent drm-msm-fixes for msm semaphores.

  • turnip: Do better TU_DEBUG=startup logging of drmGetDevices2() failure.

  • turnip: Fix error handling of DRM_MSM_GEM_INFO ioctls.

  • turnip: Properly return VK_DEVICE_LOST on queuesubmit failures.

  • gallium/util: Add a helper function for point sprite handling.

  • vc4: Enable PIPE_CAP_TGSI_TEXCOORD.

  • v3d: Enable PIPE_CAP_TGSI_TEXCOORD.

  • v3d: Fix -Wmaybe-uninitialized compiler warning in the v33 code.

  • ci: Disable pixmark-piano trace on a630 due to GPU hangs.

  • util: Avoid strict aliasing bugs in xxhash.

  • util: Mark util_format_description() as a const function.

  • softpipe: Clean up softpipe’s SSBO load/store interpreting instructions.

  • util: Remove unused util_format_planar_is_supported().

  • etnaviv: Use the util_pack_color_union() helper.

  • gallium/util: Fix location of the comment about S8_UINT handling.

  • gallium/util: Clean up the Z/S tile write path.

  • gallium/util: Move the Z/S handling to the outside of get_tile().

  • svga: Reuse util_format_unpack_rgba().

  • util: Merge util_format_write_4* functions.

  • util: Merge util_format_read_4* functions.

  • util: Use designated initializers to clean up the format tables’ pack/unpack.

  • llvmpipe: Generalize “could llvmpipe fetch this format” check in unit testing.

  • util: Remove the stub pack/unpack functions for YUV formats.

  • util: Share a single function pointer for the 4-byte rgba unpack function.

  • docs: Move the current CI .rst doc to docs/ci/ and link to it from .gitlab-ci.

  • docs: Move the conformance and the CI docs to a top level Testing section.

  • docs: Move the gitlab-ci docs to RST.

  • docs: Relax the expectations of HW CI farms.

  • docs: Document how to interact with docker containers.

  • freedreno/ir3_cmdline: Fix an uninit var warning.

  • freedreno/ir3: Fix uninit var warning.

  • intel: Fix release-build warnings about sf_entry_size.

  • intel/perf: Fix unused var warning in release builds.

  • intel/perf: Move perf query register programming to static tables.

  • freedreno/a2xx: Fix compiler warning in disasm.

  • meson: Enable GCing of functions and data from compilation units by default.

  • freedreno/ir3: Fix duplicated fine derivatives instructions.

  • freedreno/ir3: Add unit tests for derivatives disasm.

  • ci: Use FDO_CI_CONCURRENT as our -j flags when present in the runner env.

  • freedreno/ir3: Add a note about the instructions in the disasm test.

  • freedreno/ir3: Add a bunch more tests for cat6 opcodes.

  • freedreno/ir3: Refactor cat6 general dst printing.

  • freedreno/ir3: Fix disasm of register offsets in ldp/stp.

  • freedreno/ir3: Add missing ld_args_build_id to the ir3_delay unit test.

  • ci: Set XDG_CACHE_HOME to tmpfs for bare-metal runners to avoid NFS.

  • ci: Update checksums for freedreno traces.

  • llvmpipe: Remove a bunch of default handling of pipe caps.

  • llvmpipe: Use the default behavior of ALLOW_MAPPED_BUFFERS.

  • softpipe: Remove a bunch of default handling of pipe caps.

  • softpipe: Use the default behavior of ALLOW_MAPPED_BUFFERS.

  • virgl: Remove a bunch of default handling of pipe caps.

  • swr: Remove a bunch of default handling of pipe caps.

  • swr: Use the default behavior of ALLOW_MAPPED_BUFFERS.

  • svga: Remove a bunch of default handling of pipe caps.

  • i915: Remove a bunch of default handling of pipe caps.

  • softpipe: Refactor pipe_shader_state setup.

  • softpipe: Convert to comma-separated SOFTPIPE_DEBUG for debug options.

  • softpipe: Add support for reporting shader-db output.

  • softpipe: Enable PIPE_CAP_TGSI_TEXCOORD.

  • softpipe: Enable PIPE_CAP_TGSI_ANY_REG_AS_ADDRESS;

  • ci/bare-metal: Capture the first devcoredump a job produces.

  • drm-shim: Return -EINVAL instead of abort()ing on unknown ioctls.

  • docs: Explain how to set up a personal gitlab runner.

  • nir: Add a pass to cut the trailing ends of vectors.

  • i965: Enable vector shrinking in the vec4 backend.

  • amd: Swap from nir_opt_shrink_load() to nir_opt_shrink_vectors().

  • nir: Remove the old nir_opt_shrink_load.

  • freedreno: Fix “Offset of packed bitfield changed” warnings:

  • nir/lower_amul: Use num_ubos/ssbos instead of recomputing it.

  • nir: Add a little more docs about NIR’s constant_data.

  • nir: Print the constant data size associated with a shader.

  • freedreno/ir3: Fix the type of half-float indirect uniform loads.

  • freedreno/a6xx: Document the bit for the magic 32bit-uniforms-as-16b mode.

  • freedreno/computerator: Set SP_MODE_CONTROL to the same value as vulkan/GL

  • freedreno/ir3: Merge the redundant immediate_idx/immediates_count fields

  • freedreno/ir3: Simplify the immediates from an array of vec4 to array of dwords.

  • freedreno: Rename emit_const_bo() to emit_const_ptrs().

  • freedreno: Split ir3_const’s user buffer and indirect upload APIs.

  • freedreno/ir3: Clean up instrlen setup.

  • freedreno: Increase the NUM_UNIT on compute’s consts in indirect dispatch.

  • freedreno: Add more asserts for DST_OFF/NUM_UNIT in indirect const uploads.

  • freedreno/ir3: Fix assertion failures dumping CS high full regs.

  • turnip: Make sure we include the build id.

  • gallium/tgsi_exec: Fix up NumOutputs counting

  • freedreno: Make the pack struct have a .qword for wide addresses.

  • turnip: Fix truncation of CS shader iovas to 32 bits.

  • turnip: Fix truncation of iovas to 32 bits in queries.

Eric Engestrom (146):

  • cut 20.1 branch

  • docs: update calendar for 20.1.0-rc2

  • post_version.py: fix branch name construction for release candidates

  • post_version.py: invert is_point into is_first_release to make its purpose clearer

  • post_version.py: stop adding release candidates to the index and relnotes

  • docs: update calendar for 20.1.0-rc3

  • gitlab-ci: exclude scripts that don’t affect the build

  • util/rand_xor: make it clear that {,s_}rand_xorshift128plus take exactly 2 uint64_t

  • util/rand_xor: drop unused header

  • util/rand_xor: fallback Linux to time-based instead of fixed seed

  • util/rand_xor: extend the urandom path to all non-Windows platforms

  • docs: update calendar for 20.1.0-rc4

  • anv: pass the fd directly to anv_gem_reg_read()

  • anv: replace magic | 1 with already #define’d name

  • anv: disable VK_EXT_calibrated_timestamps when the timestamp register is unreadable

  • git_sha1_gen.py: fix out-of-date comment

  • git_sha1_gen.py: fix code style

  • git_sha1_gen.py: fix whitespace

  • compiler: delete leftover autotools test wrapper

  • no_extern_c.h: fix typo in comment

  • tree-wide: fix deprecated GitLab URLs

  • docs: drop no-longer-relevant comment about bugzilla

  • docs: Add release notes for 20.1.0

  • docs: update calendar, add news item, and link releases notes for 20.1.0

  • meson: remove “empty array”/“array of an empty string” confusion

  • glapi: remove deprecated .getchildren() that has been replaced with an iterator

  • intel/genxml: drop sort_xml.sh and move the loop directly in gen_sort_tags.py

  • intel: fix gen_sort_tags.py

  • docs: Add release notes for 20.1.1

  • docs: update calendar, add news item, and link releases notes for 20.1.1

  • v3d: add missing unlock() in error path

  • intel/genxml: drop python 2 support for gen_sort_tags.py

  • intel/genxml: replace gen_sort_tags.py MIT licence with SPDX equivalent

  • docs: update the blocks of unused EGL enums assigned to us

  • i965: drop dead #include “config.h”

  • iris: drop dead #include “config.h”

  • gen_release_notes.py: update script to the new rST way of things

  • post_version.py: update script to the new rST way of things

  • intel/tools: rewrite run-test.sh in python

  • intel/tools: make test aware of the meson test wrapper

  • khronos-update.py: add script to simplify update of Khronos headers & xml files

  • docs: remove plain-text copy of versions.rst

  • util/os_file: replace broken windows-detection code with detect_os.h

  • util: introduce os_dupfd_cloexec() helper

  • replace all F_DUPFD_CLOEXEC with os_dupfd_cloexec()

  • vulkan/wsi: replace all dup() with os_dupfd_cloexec()

  • radv: replace all dup() with os_dupfd_cloexec()

  • anv: replace all dup() with os_dupfd_cloexec()

  • iris: replace all dup() with os_dupfd_cloexec()

  • i965: replace all dup() with os_dupfd_cloexec()

  • egl: replace all dup() with os_dupfd_cloexec()

  • etnaviv: replace all dup() with os_dupfd_cloexec()

  • freedreno: replace all dup() with os_dupfd_cloexec()

  • svga: replace all dup() with os_dupfd_cloexec()

  • virgl: replace all dup() with os_dupfd_cloexec()

  • docs: publish our release maintainers’ keys

  • docs: remind release maintainers to sign the tarballs and publish their key

  • docs: suggest alternative installation methods for meson

  • docs: stop considering Cc: mesa-stable as an email address

  • docs: reword “sending a patch revision” to “updating a merge request”

  • docs: drop git sendemail instructions

  • docs: prefer Fixes: over Cc: mesa-stable

  • docs: add some formatting to the “backport merge request” option

  • docs: reword a sentence a bit

  • docs: make it clear that the tags need to be in the commit message

  • docs: move Fixes: tag explanation to its own section

  • docs: move “stable” tag explanation next to Fixes:

  • driconf: drop 28% catalan translation

  • driconf: drop 15% german translation

  • driconf: drop 26% spanish translation

  • driconf: drop 6% french translation

  • driconf: drop 8% dutch translation

  • driconf: drop 9% swedish translation

  • driconf: drop now unused translation facility

  • util: rename xmlpool.h to driconf.h

  • gitlab-ci: drop gettext from the build images

  • docs: drop deleted file from extra sphinx files

  • docs: cat maintainer keys to a single file

  • docs: add some padding to the release calendar

  • docs: add planning for 20.2

  • bin/symbols-check: explain C++ symbols workaround

  • docs: Add release notes for 20.1.2

  • docs: update calendar and link release notes for 20.1.2

  • docs: fix 20.1.2 relnotes

  • docs: add a page explaining the GitLab CI and the Intel CI

  • mesa/glformats: make _mesa_gles_error_check_format_and_type() more consistent

  • docs: add release notes for 20.1.3

  • docs: update calendar and link release notes for 20.1.3

  • docs: fix a bunch of typos

  • egl: always compile surfaceless

  • vulkan: automatically compile the display platform when available

  • meson: move xlib-lease block further down

  • egl: automatically compile the drm platform when available

  • introduce commit_in_branch.py script to help devs figure this out

  • bin/gen_release_notes.py: drop new_features.txt when we release XX.Y.0

  • egl/wayland: add missing newline between functions

  • glx: drop always-true #ifdef

  • docs/submittingpatches: add more than one Cc: mesa-stable example to the examples list

  • meson/intel: add missing dep on git_sha1.h

  • meson: fix android vulkan build

  • egl: inline fallback for create_pixmap_surface

  • egl: inline fallback for create_pbuffer_surface

  • egl: drop unused fallback function

  • egl: inline fallback for swap_buffers_with_damage

  • egl: inline fallback for swap_buffers_region

  • egl: inline fallback for post_sub_buffer

  • egl: inline fallback for copy_buffers

  • egl: inline fallback for query_buffer_age

  • egl: inline fallback for create_wayland_buffer_from_image

  • egl: inline fallback for get_sync_values

  • egl: drop now empty egl_dri2_fallbacks.h

  • egl: mark the rest of the callbacks as mandatory or optional

  • egl: inline _EGLAPI into _EGLDriver

  • docs: add release notes for 20.1.4

  • docs: update calendar and link release notes for 20.1.4

  • post_version.py: don’t generate relnotes twice

  • post_version.py: drop incorrect conf.py changes

  • post_version.py: stop using non-existent functions and fix commit message

  • post_version.py: update the files in the current worktree, not the one with the script that we run

  • post_version.py: fix relnotes links

  • bin/gen_release_notes: automatically commit release notes

  • docs/releasing: improve wording

  • bin/khronos-update: having a folder in include/ is not a requirement

  • bin/khronos-update: add support for the SPIRV files

  • bin/khronos-update: add workaround for python bug 9625

  • egl: replace _eglInitDriver() with a simple variable

  • egl: drop unnecessary _eglGetDriver()

  • egl: fix _eglMatchDriver() return type

  • egl: inline _eglMatchAndInitialize() and refactor _eglMatchDriver()

  • egl: rename _eglMatchDriver() to _eglInitializeDisplay()

  • egl: drop left-over function prototype

  • egl: const _eglDriver

  • egl/haiku: drop overwritten preset of EGL version

  • egl: consistently use dri2_egl_display() helper macro

  • meson: fix -D xlib-lease=auto detection

  • docs: add release notes for 20.1.5

  • docs: update calendar and link release notes for 20.1.5

  • pick-ui: specify git commands in “resolve cherry pick” message

  • egl/entrypoint-check: split sort-check into a function

  • egl/entrypoint-check: add check that GLVND and plain EGL have the same entrypoints

  • driconf: fix force_gl_vendor description

  • meson: bump required glvnd version

  • egl/x11_dri3: enable & require xfixes 2.0

  • egl/x11_dri3: implement EGL_KHR_swap_buffers_with_damage

  • meson: don’t advertise TLS support if glx wasn’t built with it

  • meson: drop leftover PTHREAD_SETAFFINITY_IN_NP_HEADER

Erico Nunes (16):

  • lima/ppir: introduce liveness internal live set

  • lima/ppir: fix lod bias register codegen

  • lima/ppir: do not assume single src for pipeline outputs

  • lima/ppir: combine varying loads in node_to_instr

  • lima/ppir: duplicate intrinsics in nir

  • lima/ppir: duplicate consts in nir

  • lima/ppir: remove unused clone functions

  • lima/ppir: rework emit nir to ppir

  • lima/ppir: rework store output

  • lima/ppir: add fallback mov option for const scheduler

  • lima/ppir: rework select conditions

  • lima/ppir: handle failures on all ppir_emit_cf_list paths

  • lima/ppir: improve handling for successors in other blocks

  • lima/ppir: rework tex lowering

  • lima/ppir: optimize tex loads with single successor

  • lima/ppir: use a ready list in node_to_instr

Erik Faye-Lund (124):

  • compiler/nir: move tan-calculation to helper

  • vtn/opencl: add native_tan-support

  • vtn/opencl: native variants of sin/cos

  • vtn/opencl: native divide support

  • vtn/opencl: native powr support

  • vtn/opencl: native recip support

  • vtn/opencl: native rsqrt support

  • vtn/opencl: native sqrt support

  • compiler/glsl: explicitly store NumUniformBlocks

  • mesa/st: consider NumUniformBlocks instead of num_ubos when binding

  • zink: use nir_lower_uniforms_to_ubo

  • zink: lower b2b to b2i

  • util/os_memory: never use os_memory_debug.h

  • st/wgl: pass st_context_iface into stw_st_framebuffer_present_locked

  • st/wgl: allocate and resolve msaa-textures

  • docs/features: add zink features

  • zink: load vk_GetMemoryFdKHR while creating screen

  • zink: add a GET_PROC_ADDR macro to simplify load_device_extensions

  • docs/features: mark GL_NV_conditional_render as done for zink

  • zink: disable vkCmdResolveImage when respecting render-condition

  • zink: do not expose real value for PIPE_CAP_MAX_VIEWPORTS

  • zink: correct PIPE_SHADER_CAP_MAX_SHADER_IMAGES

  • zink: mark depth-component cube-maps as done

  • zink: implement i2b1

  • docs: fix broken release-calendar

  • zink: hammer in an explicit wait when retrieving buffer contents for reading

  • zink: use samples from state

  • zink: do not dig into resource for nr_samples

  • zink: pass batch instead of context for queries

  • zink: implement nir_texop_txf_ms

  • zink: expose PIPE_CAP_TEXTURE_MULTISAMPLE

  • docs/features: mark GL_ARB_texture_multisample as done for zink

  • zink: use general-layout when blitting to/from same resource

  • zink: Use store_dest_raw instead of storing an uint

  • nir: reuse existing psiz-variable

  • zink: emulate B8G8R8X8_SRGB with B8G8R8A8_SRGB

  • zink: assert that image-view format isn’t undefined

  • zink: only report device-local memory as video-memory

  • gallium/hud: do not specify potentially invalid depth-range

  • TEMP: add rst-conversion scripts

  • docs: convert articles to restructuredtext

  • TEMP: remove rst-conversion scripts

  • docs: delete no longer needed file

  • docs: fixup botched table

  • docs: escape double colons

  • docs: escape asterisks

  • docs: escape trailing underscores properly

  • docs: fixup broken rst

  • docs: fixup heading-levels

  • docs: use sphinx

  • docs: disable syntax-highlighting by default

  • docs: use code-block with caption instead of table

  • docs: format notes as rst-notes

  • docs: use code-blocks

  • docs: drop open-coded toc for articles

  • docs: add xlibdriver to table-of-contents

  • docs: do not copy source-files to site

  • docs: use rst footnotes instead of manual ones

  • docs: reformat license table as rst table

  • docs: use rst-note for highlighted text

  • docs: bundle extra files

  • docs: include specs into the generated docs

  • gitlab-ci: build and deploy docs

  • docs: drop news in favour of the introduction as index-page

  • README: update references to internal docs

  • docs: update internal references

  • docs/relnotes: update internal references

  • radv: update internal reference

  • bin/perf-annotate-jit.py: update internal reference

  • docs/release-calendar: restore missing id

  • nir: do not try to merge xfb-outputs

  • Revert “gallium/hud: don’t use user vertex buffers”

  • gallium/hud: don’t use user vertex buffers

  • zink: enable cull-distance if supported

  • zink: expose GLSL 1.30

  • docs: update internal references

  • docs/relnotes: update internal references

  • docs: fixup relnotes after rst-conversion

  • docs/features: mark GL3 as complete for zink

  • docs/features: update ARB_texture_buffer_object line

  • docs/features: remove driver-list for forward-compatible context

  • mesa/main: fix inverted condition

  • gallium/os: call “ANSI” version of GetCommandLine

  • graw/gdi: do not depend on UNICODE macro

  • gallium/util: limit STACK_LEN on Windows

  • gallium/util: add missing include

  • docs: update favicon

  • docs: remove non-existent reference

  • docs: restore accidentally dropped labels

  • docs: fix internal references

  • docs: use ref-links for internal references

  • gallium/docs: update to recent sphinx

  • gallium/docs: fixup formatting of numbered lists

  • gallium/docs: remove reference to non-existent label

  • gallium/docs: use none for highlight_language

  • gallium/docs: prefix exts dir with underscore

  • gallium/docs: remove non-existent static dir

  • gallium/docs: remove unused imgmath extension

  • ci: only build docs in the upstream-repo

  • ci: only build docs if any docs changed

  • ci: test docs for non-master builds

  • ci: move deploy-stage later in the pipeline

  • ci: move test-docs to container stage

  • ci: add graphviz to the .docs-base template

  • merge gallium docs into main docs

  • docs: clean up gallium index-file

  • docs: add an extension to generate redirects

  • docs: move gallium specific docs into gallium folder

  • docs: use svg for graphviz output

  • docs: fixup envvar output

  • zink: expose depth-clip if supported

  • mesa/main: factor out one-time-init into a helper

  • mesa/main: use call_once instead of open-coding

  • gallium/util: do not use _MTX_INITIALIZER_NP on Windows

  • mesa/main: use p_atomic_inc_return instead of locking

  • mesa: do not use bitfields for advanced-blend state

  • mesa: treat Color._AdvancedBlendMode as enum

  • zink: use ralloc in nir-to-spirv

  • zink: use ralloc for plain malloc-calls

  • zink: pass mem_ctx to ralloc_size-call

  • zink: use ralloc for spirv_builder as well

  • mesa/program: fix shadow property for samplers

  • docs: add some very basic documentation about zink

  • mesa: handle GL_FRONT after translating to it

Francisco Jerez (23):

  • intel/ir: Update performance analysis parameters for memory fence codegen changes.

  • iris: Simplify iris_batch_prepare_noop().

  • iris: Extend iris_context dirty state flags to 128 bits.

  • iris: Add batch-local synchronization book-keeping to iris_bo.

  • iris: Add infrastructure to partition batch into sync boundaries.

  • iris: Bracket batch operations which access memory within sync regions.

  • iris: Annotate all BO uses with domain and sequence number information.

  • iris: Drop redundant iris_address::write flag.

  • iris: Report use of any in-flight buffers on first draw call after sync boundary.

  • iris: Introduce cache coherency matrix for batch-local memory ordering.

  • iris: Update cache coherency matrix on PIPE_CONTROL.

  • iris: Implement buffer-local memory barrier based on cache coherency matrix.

  • iris: Insert buffer barrier in existing cache flush helpers.

  • iris: Remove batch argument of iris_resource_prepare_access() and friends.

  • iris: Perform compute predraw flushes from compute batch.

  • iris: Remove depth cache set tracking and synchronization.

  • iris: Remove render cache hash table-based synchronization.

  • iris: Open-code iris_cache_flush_for_read() and iris_cache_flush_for_depth().

  • iris: Emit single render target flush PIPE_CONTROL on format mismatch.

  • iris: Remove iris_flush_depth_and_render_caches().

  • OPTIONAL: iris: Perform BLORP buffer barriers outside of iris_blorp_exec() hook.

  • iris/icl+: Report same caching domain as main surface for clear color BO.

  • intel/ir/gen12+: Work around FS performance regressions due to SIMD32 discard divergence.

Frank Binns (2):

  • docs: change “Fixes:” tag example to match git fixes output

  • egl/dri2: only take a dri2_dpy reference when binding a new context/surfaces

Frédéric Bonnard (2):

  • clover: Fix types collision between c++ and altivec

  • meson: Revert commit overriding C++ standard with gnu++11 on ppc64el

Gert Wollny (66):

  • r600: Annotate some case fallthroughs

  • r600: remove unused static functions

  • r600/sb: replace memset by using member initialization/assignment

  • r600: remove some unused variables to silence warnings

  • r600: Fix warning regarding mixing enums and unsigned in ?: expression

  • r600: Fix nir compiler options, i.e. don’t lower IO to temps for TESS

  • r600/sfn: Unify semantic name and index query and use TEXCOORD semantic

  • r600/sfn: Fix printing vertex fetch instruction flags

  • r600: Lower int64 ops from TGSI-to-NIR shaders too

  • r600: Lower lerp after tgsi_to_nir

  • r600: Add support for loading index register from other than chan X

  • r600/sfn: Handle CF index loading from non-X channel

  • r600/sfn: rework getting a vector and uniforms from the value pool

  • r600/sfn: Skip move instructions if they are only ssa and without modifiers

  • r600/sfn: re-use an allocated register in lookup

  • r600/sfn: skip copying LOD if the target register is the same

  • r600/sfn: Fix memring print output

  • r600/sfn: Fix RING instruction assembly emission

  • r600/sfn: Fix GDS assembly emission

  • r600/sfn: Fix RAT instruction assembly emission

  • r600/sfn: Make allocate_reserved_registers forward to a virtual function

  • r600/sfn: Fix handling of output register index

  • r600/sfn: Make 3vec loads skip possible moves

  • r600/sfn: Add support for viewport index output

  • r600/sfn: Take FOGC and backcolors into account in GS outputs

  • r600/sfn: Handle loading sample_pos

  • r600/sfn: Add FS output sample_mask

  • r600/sfn: Don’t reject VARYING_SLOT_PCNT

  • r600/sfn: remove pointless check

  • r600/sfn: assert when alu dest is missing

  • r600/sfn: support indirect sampler buffer reads.

  • r600/sfn: Add support for texture_samples

  • r600/sfn: use the per shader atomic base

  • r600/sfn: SSBO: Fix query of dest components

  • r600/sfn: Fix clip vertex output as possible stream variable

  • r600/sfn: Fix splitting constants that come from different kcache banks.

  • r600/sfn: Don’t reorder outputs by location

  • r600/sfn: Fix printing ALU op without dest

  • r600: Fix duplicated subexpression in r600_asm.c

  • r600/sfn: Fix mapping for f32tof64 and f64tof32

  • r600/sfn: use modern c++ in printing LDS read instruction

  • r600/sfn: Correctly update the number of literals when forcing a new group

  • r600/sfn: remove debug output leftover

  • nir: lower_tex: Don’t normalize coordinates for TXF with RECT

  • r600/sfn: lower image derefs

  • r600/sfn: Add imageio support

  • r600/sfn: Add support for image_size

  • r600/sfn: Add support for reading cube image array dim.

  • r600/sfn: Take SSBO buffer ID offset into account

  • r600/sfn: Handle memory_barrier

  • r600/sfn: Add lowering pass for shared IO

  • r600/sfn: Add support for shared atomics

  • r600/sfn: Don’t set num_components on TESS sysvalue intrinsics

  • r600/sfn: lower rotate ALU ops

  • r600/sfn: Pipe through requesting a register at a given channel

  • r600/sfn: emit texture instructions in one block

  • r600/sfn: Add option to get a temp value for a specific channel

  • r600/sfn: correct handling of loading vec4 with fetching constants

  • r600/sfn: Add a forced output swizzle for depth write

  • r600/sfn: Fix Ring output swizzle masks

  • r600/sfn: Fix default z swizzle for GDS instructions

  • r600: Add shader key item to identify when the sample mask should be used

  • r600/sfn: Only use sample mask if the according shader key is set

  • r600/sfn: Make the pin_to_channel generic

  • r600/sfn: write stream outputs to correct mem ring

  • gallivm/nir: Lower uniforms to UBOs in llvm draw if the driver didn’t request this already

Greg V (1):

  • gallium,util: undef ALIGN on FreeBSD to prevent name clash

Guido Günther (2):

  • etnaviv: drm: Use NSEC_PER_SEC

  • etnaviv: drm: Normalize nano seconds

Gurchetan Singh (1):

  • virgl: apply bgra dest swizzle and add Portal 2

Hanno Böck (1):

  • Properly check mmap return value

Hyunjun Ko (6):

  • freedreno,tu: Don’t request fragcoord components not being read.

  • tu,radv: fix potentially wrong offset of flexible array.

  • vulkan: Adds helpers for vk_object (de)allocation and (de)initialization.

  • tu: Fix wrong copies of sampler descriptor.

  • turnip: Use the common base object type and struct.

  • turnip: implement VK_EXT_private_data

Iago Toral Quiroga (7):

  • v3d/compiler: don’t rewrite unused temporaries to point to NOP register

  • v3d/compiler: fix spill offset

  • v3d/compiler: fix image size for 1D arrays

  • nir/lower_clip: make the pass compatible with Vulkan semantics

  • v3d/compiler: handle compact varyings

  • v3d/compiler: request fragment shader clip lowering to be vulkan compatible.

  • nir/lower_tex: skip lower_tex_packing for the texture samples query

Ian Romanick (24):

  • nir/algebraic: Recognize open-coded byte or word extract from bfe

  • nir/algebraic: Split ibfe and ubfe with two constant sources

  • nir/algebraic: Optimize some bfe patterns

  • nir/algebraic: Optimize ushr of pack_half, not ishr

  • nir/algebraic: Add some half packing optimizations for pack_half_2x16_split

  • nir/algebraic: Eliminate useless extract before unpack

  • i965: Assert that blorp always handles color blits

  • meta: Make _mesa_meta_texture_object_from_renderbuffer static

  • meta: Make _mesa_meta_setup_sampler static

  • meta: Remove support for clearing integer buffers

  • mesa: Add matrix utility functions to load matrices

  • mesa: Add function to calculate an orthographic projection

  • meta: Stop frobbing MatrixMode

  • meta: Use same vertex coordinates for GLSL and FF clears

  • meta: Coalesce the GLSL and FF paths in meta_clear

  • meta: Remove support for multisample blits

  • anv/tests: Don’t rely on assert or changing NDEBUG in tests

  • anv/tests: Silence unused parameter warnings in main

  • anv: Silence unused parameter warning in anv_image_get_clear_color_addr

  • intel: Silence unused parameter warning in __intel_log_use_args

  • intel/drm-shim: Add noop ioctl handler for set_tiling

  • intel/drm-shim: Return correct values for I915_PARAM_HAS_ALIASING_PPGTT

  • glsl: Remove integer matrix support from ir_dereference_array::constant_expression_value

  • nir/algebraic: Don’t distribute absolute-value into dot-products

Icecream95 (78):

  • pan/midgard: Fix old style shadows

  • panfrost: Fix background showing when using discard

  • panfrost: Enable PIPE_CAP_VERTEX_COLOR_UNCLAMPED

  • panfrost: Decode AFBC flag bits

  • panfrost: Only use AFBC YTR with RGB and RGBA

  • pan/midgard: Use a signed value for checking inline constants

  • Revert “panfrost: Keep cached BOs mmap’d”

  • panfrost: Mark PIPE_BUFFER BOs as not renderable

  • pan/mdg: Add a macro for printing instruction source information

  • pan/mdg: Move r1.w writeout to branch->dest

  • pan/mdg: Remove old zs store lowering

  • pan/mdg: Remove old depth writeout code

  • pan/mdg: Remove writeout case from bytemask_of_read_components

  • nir: Replace the zs_output_pan intrinsic with combined_output_pan

  • pan/mdg: Replace writeout booleans with a single value

  • pan/mdg: Add new depth writeout code

  • pan/mdg: Move search_var to earlier in midgard_compile.c

  • pan/mdg: Add depth/stencil support to emit_fragment_store

  • pan/mdg: Add new depth store lowering

  • pan/mdg: Print writeout sources in mir_print_instruction

  • panfrost: Add writes_stencil to the EARLY_Z disable list

  • panfrost: Move sampler view bo creation to a separate function

  • panfrost: Create a new sampler view bo when the layout changes

  • panfrost: Tiled to linear layout conversion

  • panfrost: Clean up panfrost_frag_meta_rasterizer_update

  • panfrost: Implement ARB_depth_clamp

  • pan/decode: Fix helper invocations when tracing

  • pan/decode: Add missing wrap modes

  • pan/mdg: Fix max_comp calculation for constant printing

  • panfrost: RGBA4 and RGB5_A1 framebuffer support

  • panfrost: Update sampler views when the texture bo changes

  • panfrost: Copy resources when mapping to avoid waiting for readers

  • panfrost: Only copy resources when they are in a pending batch

  • panfrost: Add PAN_MESA_DEBUG=gl3 flag

  • panfrost: Do fine-grained flushing for occlusion query results

  • pan/mdg: Vectorize vlut operations

  • pan/decode: Make mapped memory read-only while decoding

  • nir: Add a base value to load_raw_output_pan

  • panfrost: Fix MALI_READS_TILEBUFFER

  • pan/mdg: Handle tilebuffer wait loops

  • pan/mdg: Use the writeout tag for tilebuffer wait loops

  • panfrost: Add rt formats to shader state

  • panfrost: Add a bitset of render targets read by shaders

  • pan/mdg: Do the pan_lower_framebuffer pass later

  • pan/mdg: Emit a tilebuffer wait loop when needed

  • pan/mdg: Handle non-blend framebuffer lowering

  • pan/mdg: Support MRT in output load lowering

  • pan/mdg: Set the z/s store intrinsic base correctly

  • pan/mdg: Use a 32-bit ld_color_buffer op when needed

  • panfrost: Implement texture_barrier

  • panfrost: Stop keying on rt format when using native loads

  • panfrost: Use f2fmp for framebuffer lowering conversions

  • panfrost: Enable framebuffer fetch

  • pan/mdg: Fix non-debug compilation

  • compiler: Add dual-source factors to blend_factor

  • gallium: Dual source support in blend_factor_to_shader

  • pan/mdg: Add a nir pass to reorder store_output intrinsics

  • pan/mdg: Dual source blend input/writeout support

  • pan/mdg: Skip z/s combining for dual-source writes

  • panfrost: Dual source blend support

  • pan/decode: Open the dump file later

  • pan/mdg: Don’t disassemble blit shaders

  • panfrost: Rename lower_store to is_blend in pan_lower_framebuffer

  • pan/mdg: Do per-sample framebuffer loads

  • panfrost: Do per-sample shading when outputs are read

  • nir: Add a face_sysval argument to nir_lower_two_sided_color

  • nir: Fix lower_two_sided_color when the face is an input

  • panfrost: Report TEXTURE_BUFFER_OBJECTS cap when gl3 flag set

  • panfrost: Set depth_enabled when stencil is enabled

  • nir: Set the alignment for SSBO lowering

  • panfrost: Make panfrost_bo_wait take a wait_readers bool

  • panfrost: Fix calls to panfrost_flush_batches_accessing_bo

  • panfrost: Fake RGTC support

  • panfrost: Use more tilebuffer sizes

  • panfrost: 8x MRT support

  • pan/mdg: Use the blend RT for blend shader framebuffer fetches

  • panfrost: Allow PIPE_TEXTURE_1D_ARRAY textures

  • pan/mdg: Fix spilling of non-32-bit types

Icenowy Zheng (1):

  • panfrost: signal syncobj if nothing is going to be flushed

Ilia Mirkin (14):

  • freedreno/a3xx: there’s no r8i/ui rb format, only rg8i/rg8ui

  • freedreno/a3xx: reinstate rgb10_a2ui texture format

  • freedreno/ir3: avoid applying (sat) on bary.f

  • freedreno/a3xx: fix const footprint

  • freedreno: fix off-by-one in assertions checking for const sizes

  • freedreno/a3xx: parameterize ubo optimization

  • freedreno/a3xx: fix rasterizer discard

  • nouveau: allow invalidating coherent/persistent buffer backings

  • st/mesa: allow R8 to not be exposed as renderable by driver

  • a4xx: add noperspective interpolation support

  • a4xx: add polygon offset clamp, fix units

  • ir3: mark ucp_enables as allowed values on all keys

  • a4xx: hook up centroid ij coords

  • ir3: use empirical size for params as used by the shader

Indrajit Kumar Das (2):

  • st/mesa: use fragment shader to copy stencil buffer

  • st/mesa: optimize DEPTH_STENCIL copies using fragment shader

Italo Nicola (17):

  • panfrost: Fix outmods on int to float conversions

  • pan/mdg: fix src_type in instructions that need an implicit zero

  • pan/mdg: prepare effective_writemask()

  • pan/mdg: eliminate references to ins->alu.op

  • pan/mdg: eliminate references to ins->alu.reg_mode

  • pan/mdg: fix comment

  • pan/mdg: eliminate references to ins->alu.outmod

  • pan/mdg: apply float outmods to textures

  • pan/mdg: eliminate references to ins->texture.op

  • pan/mdg: eliminate references to ins->load_store.op

  • pan/mdg: defer register packing

  • pan/mdg: externalize mir_pack_mod

  • pan/mdg: remove ins->alu

  • pan/mdg: refactor emit_alu_bundle

  • pan/mdg: defer branch packing

  • pan/mdg: remove ins->br_compact and ins->branch_extended

  • pan/mdg: emit REGISTER_UNUSED on unused ALU src2

Iván Briano (9):

  • anv: use the correct format on Android

  • anv: Disable B5G6R5_UNORM_PACK16

  • anv: Add a way to reserve states from a pool

  • anv: Implement VK_EXT_custom_border_color

  • anv: support externally synchronized pipeline caches

  • anv: implement VK_PIPELINE_CREATE_FAIL_ON_PIPELINE_COMPILE_REQUIRED_BIT_EXT

  • anv: enable VK_EXT_pipeline_creation_cache_control

  • anv: Add VK_EXT_custom_border_color to relnotes

  • anv: fix allocation of custom border color pool

James Park (1):

  • amd/llvm: Reorder LLVM headers

James Zhu (1):

  • ac/gpu_info: Correct Arcturus cu bitmap

Jan Beich (5):

  • drm-uapi: Add sync_file.h

  • anv,iris: unbreak on BSDs after 812cf5f522ab,abf8aed68047

  • util: enable futex usage on BSDs after 7dc2f4788288

  • meson: unbreak sysctl.h detection on BSDs

  • anv: disable i915_perf warning on non-Linux

Jan Palus (1):

  • targets/opencl: fix build against LLVM>=10 with Polly support

Jan Zielinski (1):

  • gallium/swr: Fix crashes in sampling code

Faith Ekstrand (167):

  • intel/eu: Use non-coherent mode (BTI=253) for stateless A64 messages

  • Revert “anv/gen12: Temporarily disable VK_KHR_buffer_device_address (and EXT)”

  • vulkan: Allow destroying NULL debug report callbacks

  • vulkan,anv: Add a common base object type for VkDevice

  • anv: Stop clflushing events

  • anv: Allocate CPU-side memory for events

  • vulkan,anv: Add a base object struct type

  • vulkan,anv: Move the DEFINE_HANDLE_CASTS macros to vk_object.h

  • anv: Refactor setting descriptors with immutable sampler

  • vulkan: Add run-time object type asserts in handle casts

  • vulkan/wsi: Make wsi_swapchain inherit from vk_object_base

  • anv/allocator: Add a start_offset to anv_state_pool

  • vulkan/object: Always include the type

  • anv,vulkan: Implement VK_EXT_private_data

  • vulkan: Handle vkGet/SetPrivateDataEXT on Android swapchains

  • nir: Make “divergent” a property of an SSA value

  • util/list: Add a list pair iterator

  • util/vma: Add an option to configure high/low preference

  • util/vma: Add a debug print helper

  • util/ra: Add [de]serialization support

  • anv: Set 3DSTATE_VF_INSTANCING on the SVGS element

  • anv: Set MOCS in 3DSTATE_CONSTANT_* on Gen9+

  • nir: Add some docs to the metadata types

  • anv: Call vk_object_base_finish for image views

  • anv: Fix descriptor set clean-up on BO allocation failure

  • nir: Use 8-bit types for most info fields

  • anv:gpu_memcpy: Emit 3DSTATE_VF_INDEXING on Gen8+

  • nir: Validate jump instructions as an instruction type

  • nir: Use a switch statement in nir_handle_add_jump

  • nir: Add documentation for each jump instruction type

  • nir/clone: Re-use clone_alu for nir_alu_instr_clone

  • nir: Add a new helper for iterating phi sources leaving a block

  • nir: Add a store_reg helper and use the builder in phis_to_regs

  • nir: Add const to nir_intrinsic_src_components

  • nir/lower_double_ops: Rework the if (progress) tree

  • nir/opt_deref: Report progress if we remove a deref

  • nir/copy_prop_vars: Record progress in more places

  • nir: Fix sources for image atomic fadd

  • intel/vec4: Stomp the return type of RESINFO to UINT32

  • intel/fs: Fix unused texture coordinate zeroing on Gen4-5

  • intel/fs: Emit HALT for discard on Gen4-5

  • anv/allocator: Compare to start_offset in state_pool_free_no_vg

  • nir: Add a nir_metadata_all enum value

  • nir: Add a nir_shader_preserve_all_metadata helper

  • nir: Call nir_metadata_preserve on !progress

  • nir: Properly preserve metadata in more cases

  • intel/nir: Call nir_metadata_preserve on !progress

  • iris: Better handle metadata in NIR passes

  • anv: Add an anv_batch_set_storage helper

  • anv: Add anv_pipeline_init/finish helpers

  • nir/intrinsics: Put the _intel intrinsics together at the end

  • anv: Use resolve_device_entrypoint for dispatch init

  • vulkan: Update Vulkan XML and headers to 1.2.145

  • anv: Bump the advertised patch version to 145

  • intel/fs: Expose a couple of NIR lowering helpers

  • intel/fs: Break wm_prog_data setup into a helper

  • intel/fs: Move more prog_data setup into populate_wm_prog_data

  • intel/compiler: Expose brw_texture_offset to C

  • intel/eu: Add a brw_urb_dest_msg_type helper

  • intel/eu: Set the right subnr for ALIGN16 destinations

  • intel/eu: Add the RNDU opcode

  • vulkan/wsi: Don’t consider VK_SUBOPTIMAL_KHR to be an error condition

  • wsi/x11: Log swapchain status changes

  • freedreno: Only call nir_lower_io on shader_in/out

  • lima: Only call nir_lower_io on shader_in/out

  • nouveau: Only call nir_lower_io on shader_in/out

  • vc4: Only call nir_lower_io on shader_in/out

  • v3d: Only call nir_lower_io on shader_in/out

  • panfrost: Only call nir_lower_io on shader_in/out

  • nir: Assert that nir_lower_io is only called with allowed modes

  • nir: Remove shared support from lower_io

  • nir: Add docs to nir_lower[_explicit]_io

  • anv: Handle clamping of inverted depth ranges

  • nir/validate: Don’t abort() until after the shader has printed

  • spirv: Skip phis in unreachable blocks in the second phi pass

  • spirv: Allow block-decorated struct types for constants

  • vulkan: Update Vulkan XML and headers to 1.2.148

  • anv: Advertise VK_EXT_image_robustness

  • spirv: Update headers and grammar json

  • spirv: Add support for SPV_EXT_shader_atomic_float

  • intel/fs: Use the correct logical op for global float atomics

  • anv: Advertise support for VK_EXT_shader_atomic_float

  • nir: Allow for system values with variable numbers of destination components

  • nir/lower_io: Choose to set access based on intrinsic metadata

  • nir/lower_io: Use b2b for shader and function temporaries

  • nir/lower_io: Add support for global scratch addressing

  • spirv: Simplify our handling of NonUniform

  • spirv: Drop the void *ptr from vtn_value

  • spirv: Fix indentation in vtn_handle_ptr

  • spirv: Clean up OpSignBitSet

  • spirv: Use nir_bany/ball for OpAny/All

  • spirv: Add a helpers for getting types of values

  • spirv: Rename push_value_pointer to push_pointer

  • spirv: Add a vtn_push_nir_ssa helper

  • spirv/amd: Use vtn_push_nir_ssa

  • spirv: Add a vtn_get_nir_ssa helper

  • spirv: Use the new helpers in OpConvertUToPtr/PtrToU

  • spirv: Refactor vtn_push_ssa

  • spirv/alu: Use vtn_push_ssa_value

  • spirv/glsl450: Use vtn_push_ssa_value

  • spirv/subgroups: Stop incrementing w

  • spirv/subgroups: Refactor to use vtn_push_ssa

  • spirv: Simplify vtn_ssa_value creation

  • spirv: Hand-roll fewer vtn_ssa_value creations

  • spirv: Add better checks for SSA value types

  • spirv: Drop the sampled boolean from vtn_type

  • spirv: Give atomic counters their own variable mode

  • spirv: Add a helper for getting the NIR type of a vtn_type

  • spirv: Remove a dead case in function parameter handling

  • spirv: More heavily use vtn_ssa_value in function parameter handling

  • anv,turnip,radv,clover,glspirv: Run nir_copy_prop before nir_opt_deref

  • spirv: Rework our handling of images and samplers

  • spirv: Also copy over binding information for atomic counters

  • nir: Take a mode in remove_unused_io_vars

  • nir/dead_variables: Respect the modes passed to remove_dead_vars

  • nir: Add nir_foreach_shader_in/out_variable helpers

  • nir: Add a nir_foreach_function_temp_variable helper

  • nir: Add a nir_foreach_uniform_variable helper

  • nir: Add a nir_foreach_gl_uniform_variable helper for GL linking

  • nir: Add and use a nir_variable_list_for_mode helper

  • nir: Take a nir_shader and variable mode in assign_var_locations

  • nir: Take a shader and variable mode in nir_assign_io_var_locations

  • nir/linking: Rework some internal helpers

  • st/nir: Rework fixup_varying_slots

  • nir/split_vars: Add mode checks to list walks

  • nir: Split nir_index_vars into two functions

  • nir/lower_amul: Add a variable mode check

  • nir: Use a nir_shader and mode in lower_clip_cull_distance_arrays

  • nir/lower_io_to_temporaries: Use a separate list for new inputs

  • nir/io_to_vector: Use nir_foreach_variable_with_modes

  • nir/lower_two_sided_color: Use nir_variable_create

  • nir/lower_uniforms_to_ubo: Use nir_foreach_variable_with_modes

  • nir/split_per_member_structs: Use nir_variable_with_modes_safe

  • nir/lower_variable_initializers: Restrict the modes we lower

  • nir/gl_nir_linker: Use nir_foreach_variable_with_modes

  • freedreno/ir3_lower_tess: Rework var list helpers

  • lima/standalone: Rework i/o variable fixup

  • freedreno/ir3_cmdline: Rework i/o variable fixup

  • r600/sfn/lower_tess_io: Rework get_tcs_varying_offset

  • r600/sfn/lower_tex: Get rid of the lower_sampler vector

  • r600/sfn: Use nir_foreach_variable_with_modes in IO vectorization

  • panfrost/midgard: Make search_var take a nir_shader and mode

  • panfrost: Use nir_foreach_variable_with_modes in pan_compile

  • aco: Use nir_foreach_variable_with_modes to walk SSBOs

  • mesa/ptn: Use nir_variable_create

  • gallium/ttn: Use variable create/add helpers

  • nir: Use a single list for all shader variables

  • nir/split_per_member_structs: Inline split_variables_in_list

  • nir/gl_nir_linker: Call add_vars_with_modes once for GL_PROGRAM_INPUT

  • nir: Add a find_variable_with_[driver_]location helper

  • vulkan: Update Vulkan XML and headers to 1.2.149

  • anv: Implement VK_EXT_4444_formats

  • nir/deref: Don’t try to compare derefs containing casts

  • compiler/types: Add a struct_type_is_packed wrapper

  • spirv: Do more complex unwrapping in get_nir_type

  • anv: Advertise shaderIntegerFunctions2

  • spirv: Don’t emit RMW for vector indexing in shared or global

  • clover/spirv: Don’t call llvm::regularizeLlvmForSpirv

  • intel/nir: Pass the nir_builder by reference in lower_alpha_to_coverage

  • intel/nir: Rewrite the guts of lower_alpha_to_coverage

  • intel/fs: Fix MOV_INDIRECT and BROADCAST of Q types on Gen11+

  • intel/fs: Don’t copy-propagate stride=0 sources into ddx/ddy

  • iris: Re-emit push constants if we have a varying workgroup size

  • spirv: Run repair_ssa if there are discard instructions

  • nir: More NIR_MAX_VEC_COMPONENTS fixes

  • intel/fs/swsb: SCHEDULING_FENCE only emits SYNC_NOP

  • radeonsi: Only call nir_lower_var_copies at the end of the opt loop

Jesse Natalie (10):

  • nir_lower_io: Add addr_format_is_offset helper

  • nir: When nir_lower_vars_to_explicit_types is run on temps, update scratch_size

  • nir: Support load/store of temps as scratch in nir_lower_explicit_io

  • nir: Support vec8/vec16 in nir_lower_bit_size

  • nir: Support algebraic opts on vectors larger than 4

  • nir: Support 8 and 16 component vectors for reduceable intrinsics

  • nir/vtn: Add support for 8 and 16 vector ball/bany

  • u_debug_stack_test: Fix MSVC compiling by using ATTRIBUTE_NOINLINE

  • nir: More NIR_MAX_VEC_COMPONENTS fixes

  • glsl_type: Add packed to structure type comparison for hash map

JibbityJobbity (1):

  • drirc: Enable glthread for PCSX2

Jon Turney (1):

  • glthread: Fix use of alloca() without #include "c99_alloca.h"

Jonathan Gray (13):

  • util: unbreak endian detection on OpenBSD

  • util/anon_file: add OpenBSD shm_mkstemp() path

  • meson: build with _ISOC11_SOURCE on OpenBSD

  • meson: don’t build with USE_ELF_TLS on OpenBSD

  • meson: conditionally include -ldl in gbm pkg-config file

  • util: futex fixes for OpenBSD

  • util/u_thread: include pthread_np.h if found

  • anv: use os_get_total_physical_memory()

  • util/os_misc: add os_get_available_system_memory()

  • anv: use os_get_available_system_memory()

  • util/os_misc: os_get_available_system_memory() for OpenBSD

  • radv: remove seccomp includes

  • vulkan: make VK_TIME_DOMAIN_CLOCK_MONOTONIC_RAW_EXT conditional

Jonathan Marek (135):

  • turnip: update “fetchsize” value to match fdl6_layout changes

  • turnip: enable tiling for compressed formats

  • util/format: translate 422_UNORM and 420_UNORM vulkan formats

  • freedreno/registers: document 422_UNORM and 420_UNORM formats

  • turnip: implement VK_KHR_sampler_ycbcr_conversion

  • turnip: enable 422_UNORM formats

  • freedreno: move a4xx specific layout code to a4xx code

  • freedreno/a5xx: remove unused reference to gmem_alignw in layout code

  • freedreno/a6xx: don’t use gmem_alignw for imported buffers

  • freedreno/a6xx: split up gmem/tile alignment requirements

  • freedreno: reduce extra height alignment in a6xx layout

  • freedreno/a6xx: use RESOLVE_TS event

  • freedreno: add adreno 650

  • freedreno/layout: add explicit offset/pitch argument to fdl6_layout

  • turnip: support VkImageDrmFormatModifierExplicitCreateInfoEXT

  • turnip: fix RENDER_COMPONENTS value

  • turnip: move HLSQ_UPDATE_CNTL write to before xs config writes

  • turnip: update some properties based on blob driver

  • turnip: clamp sampler minLod/maxLod

  • freedreno/a6xx: use nonbinning VS when GS is used

  • turnip: correctly emit non-binning vs in transform feedback case

  • turnip: fix HW binning with geometry shader

  • turnip: use common emit_xs_cntl to fill a6xx_sp_xs_ctrl_reg0

  • turnip: fix VFD_CONTROL for binning pass

  • turnip: pipeline program state refactor

  • turnip: share code between 3D blit/clear path and tu_pipeline

  • turnip: add layered 3D path clear for CmdClearAttachments

  • turnip: add emit renderpass cache flushes for sysmem 3D CmdClearAttachments

  • turnip: remove some dead/redundant code

  • freedreno/ir3: fix ir3_nir_move_varying_inputs

  • turnip: remove duplicated stage2opcode and stage2shaderdb

  • turnip: simplify stage2 helpers

  • turnip: set VFD_INDEX_OFFSET in 3D clear/blit path

  • turnip: fix 3D path always being used for CmdBlitImage

  • turnip: fix cubic filtering with CmdBlitImage

  • turnip: compute and graphics have completely separate state

  • turnip: move descriptor set BO tracking to CmdBindDescriptorSets

  • turnip: improve dirty bit handling a bit

  • turnip: delete dead dynamic state code

  • turnip: refactor draw states and dynamic states

  • turnip: input attachment descriptor set rework

  • turnip: use draw states for input attachments

  • turnip: use u_format for packing gmem clear values

  • freedreno/a6xx: FETCHSIZE is PITCHALIGN

  • freedreno/fdl6: rework layout code a bit (reduce linear align to 64 bytes)

  • turnip: fix a crash when rasterizerDiscardEnable is set

  • turnip: fix a sample shading case

  • turnip: fix renderpass gmem configs when there are too many attachments

  • turnip: set the API version

  • turnip: move enum translation functions to a common header

  • freedreno/a6xx: VSC “STRM_ARRAY_PITCH” is “STRM_LIMIT”

  • freedreno/a6xx: remove unnecessary OVERFLOW_FLAG_REG check

  • turnip: remove unnecessary OVERFLOW_FLAG_REG check

  • freedreno/a4xx: restore pitch to bytes change to layout code

  • freedreno/a4xx: simplify setup_slices

  • turnip: rework streamout state and add missing counter buffer read/writes

  • turnip: refactor CmdDraw* functions (and a few fixes)

  • turnip: enable VK_EXT_index_type_uint8

  • turnip: implement CmdDrawIndirectByteCountEXT

  • turnip: fix ts_cs_memory typo

  • turnip: use pipeline cs for shader programs instead of separate bo

  • freedreno/registers: a6xx depth bounds test registers

  • turnip: implement depthBounds

  • turnip: translate CreateRenderPass to CreateRenderPass2

  • turnip: replace a memset(0) with zalloc in CreateRenderPass

  • turnip: use RenderPassCreateInfo for render_pass_add_implicit_deps

  • turnip: move some logic out of create_render_pass_common

  • turnip: implement VK_EXT_vertex_attribute_divisor

  • turnip: fix empty scissor case

  • turnip: fix update_stencil_mask

  • turnip: disable early_z for VK_FORMAT_S8_UINT

  • freedreno/registers: add CP_DRAW_INDIRECT_MULTI

  • freedreno/ir3: add support for load_draw_id

  • turnip: implement VK_KHR_shader_draw_parameters

  • turnip: fix VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VULKAN_1_1_FEATURES

  • turnip: fix huge scissor min/max case

  • freedreno/ir3: fix resinfo wrmask

  • freedreno/regs: add extra bits for UBWC array pitch

  • turnip: enable largePoints

  • turnip: enable depthBiasClamp

  • freedreno/registers: update varying-related registers

  • freedreno/a3xx: support LINEAR_PIXEL/PERSP_CENTROID/LINEAR_CENTROID sysvals

  • freedreno/a4xx: fake LINEAR_PIXEL varying support for u_blitter

  • freedreno/ir3: add generic get_barycentric()

  • freedreno/a5xx: set missing bary sysvals

  • freedreno/a6xx: set missing bary sysvals

  • turnip: set missing bary sysvals

  • freedreno/ir3: add support for INTERP_MODE_NOPERSPECTIVE

  • turnip: make tiling config part of framebuffer state

  • turnip: rework render_tiles loop

  • turnip: vsc improvements

  • turnip: fix tess param bo size calculation

  • turnip: clear_blit: pass aspect mask to setup function

  • turnip: support multi-image layouts

  • turnip: enable 420_UNORM formats

  • freedreno/layout: fix explicit layout offset not added to slice offset

  • freedreno/ir3: fix/rework tess levels

  • Revert “nir: Add an option for lowering TessLevelInner/Outer to vecs”

  • Revert “nir: Support sysval tess levels in SPIR-V to NIR”

  • freedreno/regs: document SS6_UBO state src

  • turnip: use global bo for clear blit shaders

  • freedreno/ir3: add support for a650 tess shared storage

  • freedreno/regs: document CS shared storage size bit

  • freedreno/a2xx: fix compressed textures

  • freedreno: add a fd_resource_pitch helper

  • freedreno/layout: layout simplifications and pitch from level 0 pitch

  • turnip: fix active_desc_sets not being set for compute pipeline

  • freedreno/ir3: fix setup_input for sparse vertex inputs

  • freedreno/ir3: run nir_opt_loop_unroll in optimization loop

  • freedreno: fix layout pitchalign field not being set for imported buffers

  • freedreno/regs: update primitive output related registers

  • turnip: clean up primitive output state

  • turnip: drop GS clear path

  • turnip: use DIRTY SDS bit to avoid making copies of pipeline load state ib

  • turnip: emit compute pipeline directly in CmdBindPipeline

  • turnip: fix inconsistencies with tu6_load_state_size

  • turnip: remove use of tu_cs_entry for draw states

  • gitlab-ci: re-enable arm64_a630_vk

  • freedreno/regs: update a6xx GRAS registers

  • freedreno/regs: update a6xx RB regs

  • freedreno/regs: update a6xx VPC regs

  • freedreno/regs: update a6xx PC regs

  • turnip: disable tiling for NV12/IYUV formats

  • turnip: remove extra gmem alignment

  • freedreno/ir3: fix wrong local_primitive_id_start type

  • turnip: move WFI out of draw state to fix a650 hangs

  • turnip: use patchControlPoints for HS_INPUT_SIZE value

  • turnip: fix SP_HS_UNKNOWN_A831 value for A650

  • turnip: workaround for a630 d24_unorm_s8_uint fails

  • turnip: fix sysmem CmdClearAttachments 3D fallback breaking GMEM path flush

  • turnip: delete tu_clear_sysmem_attachments_2d

  • turnip: add support for D32_SFLOAT_S8_UINT

  • turnip: rework extended formats to allow more extended formats

  • util/format: translate A4R4G4B4_UNORM and A4B4G4R4_UNORM vulkan formats

  • turnip: implement VK_EXT_4444_formats

Jordan Justen (17):

  • intel/dev: Split .num_subslices out of GEN12_FEATURES macro

  • intel/dev: Add device info for RKL

  • intel/l3: Don’t rely on cfg entry URB size being 0 as a sentinel

  • intel/l3: Allow platforms to have no l3 configurations

  • iris/l3: Enable L3 full way allocation when L3 config is NULL

  • anv: Set L3 full way allocation at context init if L3 cfg is NULL

  • intel/dev: Add device info for DG1

  • iris: Make use of devinfo has_aux_map field

  • anv: Make use of devinfo has_aux_map field

  • anv/pipeline: Split VFE/INTERFACE_DESCRIPTOR out to emit_media_cs_state

  • anv/cmd_buffer: Split GPGPU_WALKER out to emit_gpgpu_walker

  • iris: Split walker and state update into iris_upload_gpgpu_walker

  • iris/compute: Split out iris_load_indirect_location

  • intel/compiler/cs: Allow simd32 in some more cases with no8 and/or no16

  • intel/compiler/fs: Still attempt simd32 when INTEL_DEBUG=no16 is used

  • iris: Add missing break in switch in modifier_is_supported

  • anv, iris: Set MediaSamplerDOPClockGateEnable for gen12+

Jose Maria Casanova Crespo (4):

  • v3d: Fix swizzle in DXT3 and DXT5 formats

  • v3d: Include supported DXT formats to enable s3tc/dxt extensions

  • vc4: don’t rely on intr->num_components for non-vectorized intrinsics

  • nir: only uniforms with dynamically_uniform offset are dynamically_uniform

Joshua Ashton (7):

  • anv: Remove RANGE_SIZE usage

  • radv: Remove RANGE_SIZE usage

  • turnip: Remove RANGE_SIZE usage

  • vulkan: Update Vulkan XML and headers to 1.2.140

  • radv: Implement VK_EXT_custom_border_color

  • radeonsi: Use TRUNC_COORD on samplers

  • radv: Implement VK_EXT_4444_formats

José Fonseca (3):

  • glthread: Add GLAPIENTRY to _mesa_marshal_MultiDrawArrays.

  • appveyor: Upgrade pip.

  • appveyor: Use Python3.

Karol Herbst (50):

  • nir/deref: copy ptr_stride when rematerializing

  • nir/validate: validate the stride for deref_ptr_as_array

  • Revert “nir/validate: validate the stride for deref_ptr_as_array”

  • nvir/nir: use component helpers instead of insn->num_components

  • st/mesa: lower images when needed

  • nir/lower_images: fix for array of arrays

  • nir/lower_images: handle dec and inc

  • nv50/ir/nir: move away from image_deref intrinsics

  • nv50/ir/nir: handle image atomic inc and dec

  • nv50/ir/nir: remove image uniform hack

  • gv100/ir: fix atom cas

  • gv100/ir: fix shift lowering

  • gv100/ir: fix OP_TXG for shadow textures

  • nv50/ir/nir: add workaround for double vertex attribs

  • nv50/ir/print: add missing VIEWPORT_MASK handling

  • nv50/ir/nir: fix ext_demote_to_helper_invocation

  • nv50/ir/nir: fix nv_viewport_array2

  • nvc0: enable spirv caps with nir

  • nv50/ir/nir: don’t emit a restart with set a stream_id

  • nv50/ir/nir: handle clip vertex for tess eval shaders

  • nv50/ir/nir: rework input output handling

  • nv50/ir/nir: rework CFG handling

  • nv50/ir/ra: convert some for loops to Range-based for loops

  • nv50/ir/ra: fix memory corruption when spilling

  • nv50/ir/nir: fix interpolation on explicit operations

  • gv100/ir: implement sample shading

  • gv100/ir: fix coherent and volatile memory access

  • nv50/ir/nir: fix cache mode conversion

  • nv50/ir: fix memset on non trivial types warning

  • nv50/ir/tgsi: move call to tgsi_scan_shader inside Source constructor

  • nvc0: set local mem size for compute on gv100

  • nvc0: set sampler index mode to independently on gv100 compute

  • gv100/ir: set ftz bit on floating point operations

  • ci: bump libdrm to 2.4.102

  • nouveau: enable HMM

  • gallium: add PIPE_CAP_RESOURCE_FROM_USER_MEMORY_COMPUTE_ONLY

  • nvc0: support PIPE_CAP_RESOURCE_FROM_USER_MEMORY_COMPUTE_ONLY

  • nouveau: expose HMM

  • ci: need to install wget in order to download libdrm

  • ci: bump libdrm to 2.4.102

  • nouveau: enable HMM

  • gallium: add PIPE_CAP_RESOURCE_FROM_USER_MEMORY_COMPUTE_ONLY

  • nvc0: support PIPE_CAP_RESOURCE_FROM_USER_MEMORY_COMPUTE_ONLY

  • nouveau: expose HMM

  • st/mesa: fix st_CopyPixels without support for stencil exports

  • nv50/ir/tgsi: silence warning about unhandled GS_INPUT_PRIM property

  • nv50/ir: initialize persampleInvocation to false

  • nir/lower_io: assert that offsets are used for shader_in

  • nv50/ir/nir: fix global_atomic_comp_swap

  • spirv: extract switch parsing into its own function

Kenneth Graunke (20):

  • iris: Include linux/sync_file.h instead of cut and pasting contents

  • anv: Include linux/sync_file.h instead of cut and pasting contents

  • iris: Rename iris_syncpt to iris_syncobj for clarity.

  • iris: Give up on not passing ice to iris_init_batch

  • iris: Destroy transfer slab after batches

  • iris: Flush any current work in iris_fence_await before adding deps

  • intel: Move anv_gem_supports_syncobj_wait to common code.

  • iris: Detect DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT kernel support

  • iris: Implement PIPE_FLUSH_DEFERRED support.

  • intel: Delete hardcoded devinfo->urb.size values for Gen7+ (sans DG1).

  • iris: Delete useless #define

  • intel/eu: Add a brw_urb_desc helper

  • CI: Disable Panfrost Mali-T820, Lima Mali-400 and Lima Mali-450 jobs

  • intel: Disable loading drivers on DG1 devices for now

  • nir: Fix divergence analysis for tessellation input/outputs

  • iris: Implement pipe->texture_subdata directly

  • iris: Fix CCS check in iris_texture_subdata().

  • iris: Delete shader variants when deleting the API-facing shader

  • iris: Reorder the loops in iris_fence_await() for clarity.

  • iris: Drop stale syncobj references in fence_server_sync

Kristian Høgsberg (73):

  • freedreno/ir3: Pass stream output info to ir3_shader_from_nir

  • freedreno/ir3: Rename ir3_nir_lower_to_explicit_io

  • freedreno/ir3: Add ir3_nir_lower_to_explicit_input() pass

  • freedreno/ir3: Lower GS builtins before lowering IO

  • freedreno/ir3: Drop hack to clean up split vars

  • freedreno/fdl: Align after dividing by block size

  • freedreno/a6xx: Set tfetch correctly for compressed formats

  • freedreno/ir3: Drop wrmask for ir3 local and global store intrinsics

  • freedreno/a6xx: Create shader dependent streamout state at compile time

  • freedreno/a6xx: Map inputs to VFD entries up front

  • freedreno/a6xx: Allocate ringbuffer based on VFD count

  • freedreno/a6xx: Emit VFD setup as array writes

  • freedreno/a6xx: Avoid stalling for occlusion queries

  • freedreno: Use the right amount of &’s

  • freedreno: Use explicit *_NONE enum for undefined formats

  • turnip: Use hw enum when emitting A6XX_RB_STENCIL_CONTROL

  • turnip: Use tu6_reduction_mode() to avoid warning

  • turnip: Use {} initializer to silence warning

  • freedreno/ir3: Avoid {0} initializer for struct reginfo

  • src/util: Remove out-of-range comparison

  • mapi: Fix a couple of warning in generated code

  • mesa/st: Use memset to zero out struct

  • egl/android: Move get_format under HAVE_DRM_GRALLOC guard where it’s used

  • egl/android: Drop unused variable

  • freedreno/a6xx: Move per element offset to VFD_DECODE

  • freedreno/a6xx: Decouple VFD_FETCH and VFD_DECODE

  • freedreno/a6xx: Create stateobj for VFD_DECODE

  • freedreno/a6xx: Program VFD_DEST_CNTL from program stateobj

  • freedreno/a6xx: Turn on robustness extensions

  • docs/features.txt: Update for freedreno

  • freedreno/a6xx: Fix VFD_CONTROL emit

  • freedreno/a6xx: Don’t write REG_A6XX_RB_SRGB_CNTL in restore

  • freedreno/a6xx: Set index buffer size to bo size

  • freedreno: Handle DRM_FORMAT_MOD_INVALID in shared code

  • turnip: Put VK_KHR_external_fence_fd stubs back

  • freedreno/a6xx: Don’t blit with R2D_RAW

  • freedreno/a6xx: Move fd6_ifmt into fd6_blitter.c

  • freedreno/a6xx: Split out src and dst setup helpers for blit

  • freedreno/a6xx: Don’t set unknown bit when tiling differs

  • freedreno/a6xx: Set src and dst rects outside blit loop

  • freedreno/a6xx: Program SP_2D_SRC_FORMAT outside blit loop

  • freedreno/a6xx: Consolidate computing blit_cntl

  • freedreno/a6xx: Don’t emit src state when clearing

  • freedreno/a6xx: Separate stencil sysmem clear fix

  • freedreno/a6xx: Enable FMT6_10_10_10_2_UNORM blitting

  • freedreno/a6xx: Make blit_control helper a little more helpful

  • freedreno/a6xx: Program A6XX_SP_2D_SRC_FORMAT_COLOR_FORMAT based on dst format

  • freedreno/a6xx: Move REG_A6XX_SP_2D_SRC_FORMAT programming to helper

  • freedreno/a6xx: Move CP_SET_MARKER to setup helper

  • freedreno/a6xx: Program RB_UNKNOWN_8C01 in setup helper

  • freedreno/a6xx: Don’t take pipe_blit_info in emit_blit_dst

  • freedreno/a6xx: Split clear and blit texture into different functions

  • freedreno/registers: Rename SP_2D_SRC_FORMAT

  • turnip: Move device enumeration and feature discovery to tu_drm.c

  • turnip: Move tu_bo functions to tu_drm.c

  • turnip: Collapse some tu_drm wrappers

  • turnip: Move remaining drm code to tu_drm.c

  • turnip: Only include msm_drm in tu_drm.c

  • egl/android: Remove unused variable

  • mapi/test: Change type to unsigned for offset

  • gallium: Switch u_debug_stack/symbol.c to util/hash_table.h

  • util: Move stack debug functions to src/util

  • util: Add unit test for stack backtrace capture

  • gallium/android: Rewrite backtrace helper for android

  • ci: Include enough Android headers to let us compile test EGL

  • mapi: Mark TLS symbols as optional in glapi-symbols.txt

  • turnip: Make tu_android.c compile again

  • meson: Define ANDROID and ANDROID_API_LEVEL when compiling for Android

  • anv: Pass device to setup_gralloc0_usage for error reporting

  • anv: Add stub for anv_gem_get_tiling() for Android

  • vulkan: Allow global symbol HMI for Android

  • radv/android: Remove unused variable

  • ci: Add a build test for the Android platform

Krzysztof Raszkowski (1):

  • gallium/swr: Fix building swr with MSVC

Laura Ekstrand (3):

  • docs: include meson in the toctree

  • docs: Remove version.

  • docs: Add the favicon to the new page.

Leo Liu (3):

  • radeon/vcn: reset the decode flags from message buffer

  • radeon/vcn: add Sienna to use internal register offset

  • radeon/vcn/dec: add db_aligned_height to message buffer

Lepton Wu (3):

  • mapi: x86: Fix dynamic entries in x86 tsd stubs.

  • mapi: Return NULL function pointers for GL_EXT_debug_marker

  • egl: Allow software rendering for vgem/virtio_gpu in platform_device

Lionel Landwerlin (60):

  • drm-shim: move handle lock to shim_fd

  • drm-shim: don’t create a memfd per BO

  • drm-shim: silence warnings

  • intel/dev: print out error when platform is not found by name

  • intel: add stub_gpu tool

  • ci: Add intel to shaderdb runs

  • iris: don’t assert on unfinished aux import in copy paths

  • anv: don’t expose VK_INTEL_performance_query without kernel support

  • anv: fix alignments for uniform buffers

  • genxml: run sorting script

  • genxml: fix invalid end value for video fields

  • genxml: factor out utility functions

  • genxml: pack: deal with default field not being simple integers

  • intel/genxml: fix bits generation for MI_LOAD_REGISTER_IMM

  • intel/mi-builder: add framework for self modifying batches

  • anv: don’t reserve a particular register for draw count

  • anv: add a new execution mode for secondary command buffers

  • intel/genxml: add PIPE_CONTROL command cache invalidate bit

  • intel/perf: make pipeline statistic query loading optional

  • intel/perf: store the appropriate OA formats in queries

  • intel/perf: update generated code to ralloc all data

  • intel/perf: create a unique list of counters

  • intel/perf: compute number of passes for a set of counters

  • intel/perf: emit counter units in generated code

  • intel/perf: add helper to compute metrics from counters

  • intel/perf: add counter category to generated code

  • intel/perf: report whether the platform supported

  • anv: use a query filled by the perf code

  • intel/perf: reuse offset specified in the query

  • anv: Implement VK_KHR_performance_query

  • intel/perf: repurpose INTEL_DEBUG=no-oaconfig

  • anv: fixup unwinding of device create failure

  • blorp: rename workaround address function

  • anv: store the workaround address

  • iris: store workaround address

  • i965: store workaround_bo offset

  • intel: add identifier for debug purposes

  • iris: add identifier BO

  • i965: add identifier BO

  • anv: add identifier BO

  • intel/aub_error_decoder: print driver identifier if found

  • iris: fix BO destruction in error path

  • i965: don’t forget to set screen on duped image

  • iris: fix export of GEM handles

  • i965: fix export of GEM handles

  • anv: add an option to disable secondary command buffer calls

  • anv: garbage collect timeline semaphore when querying value

  • iris: fix fallback to swrast driver

  • anv: fix uninitialized variable access

  • anv: properly handle fence import of sync_fd = -1

  • anv: fix descriptor set free

  • anv: fix incorrect realloc failure handling

  • anv: centralize vk to gen arrays

  • anv: fix up dynamic clip emission

  • anv: don’t fail userspace relocation with perf queries

  • anv: fix transform feedback surface size

  • anv: VK_INTEL_performance_query interaction with VK_EXT_private_data

  • intel/perf: store query symbol name

  • intel/perf: fix raw query kernel metric selection

  • intel/compiler: fixup Gen12 workaround for array sizes

Liviu Prodea (1):

  • util: Make process_test path compatible with mingw native toolchains

Louis-Francis Ratté-Boulianne (1):

  • nir: Always create UBO variable when lowering uniforms to ubo

Lucas Stach (3):

  • etnaviv: generalize FE stall before loading shader and sampler states

  • etnaviv: retarget transfer to render resource when necessary

  • etnaviv: don’t expose timer queries

Luigi Santivetti (3):

  • dri2: dri2_make_current() fold multiple if blocks

  • dri2: do not conflate unbind and bindContext() failure

  • egl/dri2: try to bind old context if bindContext failed

Marcin Ślusarz (24):

  • i965: remove unused variable

  • glsl_to_tgsi: add fallthrough comments

  • glsl: cleanup vertex shader input checks

  • iris: remove unused iris_bo->swizzle_mode

  • intel/compiler: fix Android build

  • st/mesa: fix reporting of float perf counters max value

  • iris: return max counter value for AMD_performance_monitor

  • iris: remove iris_monitor_config

  • intel/perf: move query_mask and location out of gen_perf_query_counter

  • iris: propagate error from gen_perf_begin_query to glBeginPerfQueryINTEL

  • i965: propagate error from gen_perf_begin_query to glBeginPerfQueryINTEL

  • util: fix possible fd leaks in os_socket_listen_abstract

  • glsl: catch out of bounds access in the debug version

  • util: fix possible buffer overflow in util_get_process_exec_path

  • util/format: initialize non-important components to 0

  • mesa: fix out of bounds access in glGetFramebufferParameterivEXT

  • mesa: quiet down static analyzers

  • iris: quiet down static analyzers

  • intel/vec4: fix out of bounds read

  • intel/perf: fix performance counters availability after glFinish

  • anv: refresh cached current batch bo after emitting some commands

  • anv: fix minor gen_ioctl(I915_PERF_IOCTL_CONFIG) error handling issue

  • intel/perf: split load_oa_metrics

  • intel/perf: export performance counters sorted by [group|set] and name

Marek Olšák (226):

  • mesa: optimize glPush/PopClientAttrib by removing malloc overhead

  • mesa: don’t call _mesa_update_state for _mesa_get_clamp_fragment_color

  • mesa: don’t set unnecessary program flags in _mesa_update_state

  • mesa: don’t update shaders on fixed-func state changes if user shaders are bound

  • mesa,st/mesa: add a fast path for non-static VAOs

  • mesa: inline vbo_context inside gl_context to remove vbo_context dereferences

  • mesa: add glInternalBufferSubDataCopyMESA for glthread

  • mesa: add _mesa_InternalBind{ElementBuffer,VertexBuffers} for glthread

  • glthread: do glBufferSubData as unsynchronized upload + GPU copy

  • glthread: don’t use atomics for refcounting to decrease overhead on AMD Zen

  • glthread: track pointers and strides for Pointer & EXT_dsa attrib functions

  • glthread: track instance divisor changes

  • glthread: track primitive restart state

  • glthread: initialize VAOs properly

  • glthread: handle POS vs GENERIC0 aliasing

  • glthread: handle gl{Push,Pop}ClientAttrib{DefaultEXT} for glthread states

  • glthread: upload non-VBO vertices and indices for non-Indirect non-IBM draws

  • tgsi_to_nir: handle TGSI_SEMANTIC_BLOCK_SIZE

  • tgsi_to_nir: handle TGSI_OPCODE_BARRIER

  • radeonsi: unify and align down the max SSBO/TBO/UBO buffer binding size

  • radeonsi: clean up and deduplicate code around internal compute dispatches

  • radeonsi: bind shader images after DCC is disabled for image stores

  • radeonsi: add SI_IMAGE_ACCESS_DCC_OFF to ignore DCC for shader images

  • radeonsi: implement and use compute-based DCC decompression on gfx9-10

  • radeonsi: add a workaround to fix KHR-GL45.texture_view.view_classes on gfx9

  • radeonsi: fix si_compute_clear_render_target with render condition enabled

  • radeonsi: revert an accidental change in si_clear_buffer

  • Revert “ac/surface: remove RADEON_SURF_TC_COMPATIBLE_HTILE and assume it’s always set”

  • Revert “ac: reassociate FP expressions for inexact instructions for radeonsi”

  • ac/surface: fix MSAA crash with FORCE_SWIZZLE_MODE on gfx9

  • radeonsi: don’t wait for idle at the end of gfx IBs

  • ac/surface: unset RADEON_SURF_TC_COMPATIBLE_HTILE if HTILE hasn’t been computed

  • radeonsi/gfx9: always use IMG_DATA_FORMAT_S8_32 for 8-bit stencil

  • radeonsi: allow tc_compatible_htile to be mutable

  • radeonsi: enable TC-compatible HTILE on demand for best Z/S performance

  • tgsi_to_nir: translate non-vec4 image stores correctly

  • radeonsi: fix compilation of monolithic PS

  • amd: update amdgpu_drm.h

  • amd: remove duplicated definitions from amdgpu_drm.h

  • amd: assume CMASK is always rb/pipe_aligned, remove ac_surface.u.gfx9.cmask

  • amd: assume HTILE is always rb/pipe_aligned, remove ac_surface.u.gfx9.htile

  • ac/surface,radeonsi: move the set/get_bo_metadata code to ac_surface.c

  • ac/surface,radeonsi: move the set/get_umd_metadata code into ac_surface.c

  • amd: unify code for overriding offset and stride for imported buffers

  • ac/surface: override all offsets including metadata offsets

  • ac/surface: fix broken pitch override on gfx8

  • gallium: rename ‘state tracker’ to ‘frontend’

  • gallium: change comments to remove ‘state tracker’

  • gallium: rename PIPE_RESOURCE_FLAG_ST_PRIV to FRONTEND_PRIV

  • gallium: remove more “state tracker” occurrences

  • radeonsi: also enable tgsi_to_nir caching for compute shaders

  • glthread: stop using GLenum16 to get correct GL errors for out-of-bounds enums

  • radeonsi: don’t expose 16xAA on chips with 1 RB due to an occlusion query issue

  • ac/nir: honor ACCESS_STREAM_CACHE_POLICY for L1 and L0 caches too

  • radeonsi: use correct clear value size for EQAA in expand_fmask

  • radeonsi: optimize access pattern for compute blits with linear textures

  • radeonsi: tweak clear/copy_buffer limits when to use compute

  • radeonsi: simplify setting resource usage for si_init_temp_resource_from_box

  • radeonsi: rename SI_RESOURCE_FLAG_TRANSFER to FORCE_LINEAR

  • radeonsi: use vi_dcc_enabled instead of using tex->surface.dcc_offset directly

  • radeonsi: use display_dcc_offset for setting displayable_dcc_cb_mask

  • winsys/amdgpu: add RADEON_FLAG_UNCACHED for faster blits over PCIe

  • radeonsi: disable the L2 cache for most CPU mappings of textures

  • radeonsi: disable the L2 cache for CPU read mappings of buffers

  • radeonsi: compute perf tests - don’t test 1 wave/SA limit, test no limit first

  • radeonsi: test uncached clear/copy buffer performance with compute shaders

  • gallium/u_threaded: execute transfer_unmap with THREAD_SAFE directly

  • ac/gpu_info: compute the best safe IB alignment

  • ac/surface: don’t compute single-sample CMASK if it’s unaligned

  • radeonsi: don’t use INDIRECT_BUFFER within IBs

  • radeonsi: decrease the max GS invocation count to 32

  • Revert “radeonsi: don’t wait for idle at the end of gfx IBs”

  • ac: update register and packet definitions for preemption

  • radeonsi: move resetting tracked registers into a new function

  • radeonsi: split si_all_descriptors_begin_new_cs and rename functions

  • radeonsi: don’t enable TC-compatible HTILE for stencil if stencil doesn’t use it

  • radeonsi/gfx8: enable TC-compatible HTILE from the beginning as before

  • radeonsi: don’t hardcode most perf counter block counts

  • ac/gpu_info: replace num_good_cu_per_sh with min/max_good_cu_per_sa

  • amd: replace SH -> SA (shader array) in comments

  • radeonsi/gfx10: implement most performance counters

  • glthread: don’t upload for glDraw inside a display list and always sync

  • nir: add i2imp and u2ump opcodes for conversions to mediump

  • nir: add int16 and uint16 type helpers

  • nir: lower int16 and uint16 in nir_lower_mediump_outputs

  • nir: fix lower_wpos for 16-bit fddy

  • nir: add options::vectorize_vec2_16bit to limit vectorization to vec2 16

  • glsl: treat lowp as mediump when lowering builtins

  • glsl: handle int16 and uint16 types and add instructions for mediump

  • glsl: lower mediump integer types to int16 and uint16

  • glsl: lower mediump partial derivatives

  • glsl: lower the precision of imageLoad

  • glsl: lower samplers with highp coordinates correctly

  • gallium: add shader caps INT16 and FP16_DERIVATIVES

  • ac: rename has_double_rate_fp16 -> has_packed_math_16bit

  • ac/nir: use more types from ac_llvm_context

  • ac/nir: support vector types in the type suffix of overloaded intrinsics

  • ac/nir: remove type and num_channels args from ac_build_buffer_store_common

  • ac/nir: support 16-bit data in buffer_load_format opcodes

  • ac/nir: support 16-bit data in image opcodes

  • ac/nir: handle nir_op_[fiu]2[fiu]mp opcodes

  • ac/nir: select v_cvt_pkrtz for all conversions from f32 to f16 for radeonsi

  • ac/nir: set the second v_cvt_pkrtz argument to undef if it’s unused

  • ac/nir: support v2f16 derivatives

  • nir: don’t count samplers and images in interface blocks

  • nir: gather which images are buffers

  • nir: gather which images are MSAA

  • radeonsi: remove unused leftover code for INDIRECT_BUFFER inside IBs

  • radeonsi: remove const_buffers_declared hacks

  • radeonsi: pass at most 3 images and/or shader buffers via user SGPRs for compute

  • radeonsi: add a hack to disable TRUNC_COORD for shadow samplers

  • gallium/u_vbuf: get rid of some pointer dereferences

  • gallium/u_vbuf: add a faster path for uploading non-interleaved attribs

  • glthread: sync in glFlush for multiple contexts

  • radeonsi: enable ARB_sparse_buffer

  • ac,radeonsi: replace == GFX10 with >= GFX10 where it’s needed

  • ac,radeonsi: start adding support for gfx10.3

  • ac/surface: add displayable DCC code for gfx10.3

  • radeonsi: honor a user-specified pitch on gfx10.3

  • radeonsi: enable larger SDMA clears and copies on gfx10.3

  • radeonsi: implement R9G9B9E5 render target and image store support on gfx10.3

  • radeonsi: move L2_CACHE_CONTROL registers into si_emit_framebuffer_state

  • radeonsi: set BIG_PAGE fields on gfx10.3

  • radeonsi: don’t set any XNACK options on gfx10.3

  • ac: align num_vgprs for gfx10.3

  • radeonsi: add support for Sienna Cichlid

  • radeonsi: require LLVM 11 for gfx10.3

  • ac/surface: don’t recompute the DCC retile map for imported textures

  • amd/addrlib: don’t recompute DCC info for every ComputeDccAddrFromCoord call

  • amd/addrlib: remove unused members of ADDR2_COMPUTE_DCC_ADDRFROMCOORD_INPUT

  • ac/surface: add a wrapper structure to hold ADDR_HANDLE

  • ac/surface: cache DCC retile maps (v2)

  • amd/addrlib: fix the C++ one definition rule violation

  • ac/surface: don’t set is_displayable if displayable DCC is missing

  • ac/surface: require that gfx8 doesn’t have DCC in order to be displayable

  • ac/surface: enable DCC for the first level in the mip tail on gfx10

  • ac/surface: don’t free dcc_retile_map on failure

  • radeonsi: compact MRTs to save PS export memory space

  • ac/nir: fix 64-bit division for GL CTS

  • glapi: fix incorrect param names in ARB_vertex_attrib_binding functions

  • glthread: rename non_vbo_attrib_mask -> user_buffer_mask, attribs -> buffers

  • glthread: handle ARB_vertex_attrib_binding

  • radeonsi: don’t wait for idle at the end of gfx IBs

  • radeonsi: replace ctx->screen with sscreen in si_flush_gfx_cs

  • glsl,driconf: add allow_glsl_120_subset_in_110 for SPECviewperf13

  • driconf: add workarounds for SPECviewperf13

  • amd: add proper definitions for NOP packets

  • ac,winsys/amdgpu: align IBs the same as the kernel

  • radeonsi: don’t add the border color buffer into the init_config state

  • radeonsi: rename init_config states to cs_preamble states

  • radeonsi: don’t add the tess ring buffers into the cs_preamble state

  • radeonsi: make wait_mem_scratch unmappable

  • radeonsi: disallow adding BOs into si_pm4_state except 1 shader BO per state

  • radeonsi: make si_pm4_cmd_begin/end static and simplify all usages

  • radeonsi: clear per-context buffers at the end of si_create_context

  • radeonsi: remove tabs

  • radeonsi: don’t flush in fence_server_sync

  • ac/gpu_info: fix num_physical_sgprs_per_simd for gfx10

  • radeonsi: fix NGG culling for Wave64

  • radeonsi: always use Wave32 for GS fast launch, because Wave64 hangs

  • radeonsi: always use Wave64 for HS/GS/VS shader stages (except GS fast launch)

  • radeonsi: don’t try to enable NGG culling for GS

  • radeonsi: add a debug option to enable NGG culling for tessellation

  • glsl: make print_type non-static for debugging

  • glsl: print precision qualifiers in IR dumps

  • glsl: print constant initializers

  • glsl: fix the type of ir_constant_data::u16

  • glsl: fix evaluating float16 constant expression matrices

  • glsl: run validate_ir_tree if GLSL_VALIDATE=1 regardless of the build config

  • glsl: validate more stuff

  • glsl: convert reusable lower_precision util code into helper functions

  • glsl: remove the return type from lower_precision

  • glsl: cleanups in lower_precision

  • glsl: flatten a tautological conditional in lower_precision

  • glsl: don’t lower precision of textureSize

  • glsl: don’t lower builtins to mediump that don’t allow it

  • glsl: lower builtins to mediump that ignore precision of certain parameters

  • glsl: lower builtins to mediump that always return mediump or lowp

  • glsl: add capability to lower mediump array types

  • glsl: lower mediump temporaries to 16 bits except structures (v2)

  • gallium: add PIPE_SHADER_CAP_GLSL_16BIT_TEMPS for LowerPrecisionTemporaries

  • Revert “ac/surface: require that gfx8 doesn’t have DCC in order to be displayable”

  • glsl: don’t validate array types in ir_dereference_variable

  • radeonsi: prevent a gfx10_ngg_calculate_subgroup_info failure for TES+NGG GS

  • radeonsi: add missing initialization of registers

  • radeonsi/gfx10: set the correct value for OFFCHIP_BUFFERING

  • radeonsi: sort registers in si_emit_initial_compute_regs according to GPU gen

  • radeonsi: sort registers in si_init_cs_preamble_state according to GPU gen

  • ac: add helper ac_get_register_name

  • ac: add tables for CP register shadowing

  • winsys/amdgpu: make amdgpu_bo_unmap non-static

  • radeonsi: make cs_preamble_state optional

  • radeonsi: reorder code in update_gs_ring_buffers and init_tess_factor_ring

  • radeonsi: implement CP register shadowing

  • radeonsi: add reg shadowing codepaths to GS and tess ring setup

  • radeonsi: add debug code for register shadowing

  • radeonsi: don’t restore states at the beginning of IBs if they’re shadowed

  • radeonsi: set up IBs for preemption

  • radeonsi: enable preemption if the kernel enabled it

  • amd: rename SIENNA -> SIENNA_CICHLID

  • amd: add support for Navy Flounder

  • amd: enable displayable DCC for everything newer than Navi1x

  • radeonsi: disable SDMA on gfx9

  • radeonsi: reorder NIR optimizations

  • radeonsi: call nir_split_array_vars/shrink_vec_array_vars/opt_find_array_copies

  • glsl: lower_precision - fix assertion failure with dereferences of constants

  • glsl: fix constant expression evaluation for 16-bit types

  • glsl: don’t lower atomic functions to mediump

  • glsl: don’t create conversion opcodes for array types

  • glsl: don’t lower to mediump for desktop OpenGL

  • glsl: improve precision determination for calls

  • Revert “radeonsi: honor a user-specified pitch on gfx10.3”

  • radeonsi: use correct wave size in gfx10_ngg_calculate_subgroup_info

  • radeonsi: use the same units for esgs_ring_size and ngg_emit_size

  • radeonsi: increase minimum NGG vertex count requirement per workgroup on gfx 10.3

  • radeonsi: fix applying the NGG minimum vertex count requirement

  • radeonsi: don’t count unusable vertices to the NGG LDS size

  • radeonsi: add a common function for getting the size of gs_ngg_scratch

  • radeonsi: remove the NGG hack decreasing LDS usage to deal with overflows

  • radeonsi: various fixes for gfx10.3

  • radeonsi: disable NGG culling on gfx10.3 because of hangs

  • st/mesa: don’t generate NIR for ARB_vp/fp if NIR is not preferred

  • radeonsi: fix tess levels coming as scalar arrays from SPIR-V

  • gallivm: fix build on LLVM 12 due to LLVMAddConstantPropagationPass removal

  • ac/llvm: fix unaligned VS input loads on gfx10.3

  • Revert “ac: generate FMA for inexact instructions for radeonsi”

Marek Vasut (3):

  • etnaviv: Disable seamless cube map on GC880

  • etnaviv: Remove etna_resource_get_status()

  • etnaviv: Add lock around pending_ctx

Mario Kleiner (1):

  • vulkan/wsi: Really terminate DRM lease in wsi_release_display().

Mathias Fröhlich (2):

  • st/mesa: Move _NEW_FRAG_CLAMP to NewFragClamp driver flag.

  • mesa: set _NEW_FRAG_CLAMP only when needed

Matt Turner (22):

  • intel/compiler: Drop opt_sampler_eot()

  • intel/tools: Remove unnecessary reg number checking

  • intel/tools: Drop srctype from ipreg

  • intel/tools: Require explicit regions/types for special regs

  • intel/tools: Disallow control subregisters > 3

  • intel/tools: Add assembler tests for the cr0 register

  • intel/compiler: Add assert that set bits are within mask

  • intel/compiler: Don’t emit no-op cr0 changes

  • intel/tools: Fix typos

  • intel/tools: Remove stray newline

  • intel/tools: Don’t allow empty type specifier

  • intel/tools: Simplify register type handling

  • intel/tools: Make swizzle an integer

  • intel/tools: Make writemask an integer

  • intel/tools: Simplify immediate handling

  • intel/tools: Simplify dstregion

  • intel/compiler: Relax SENDS regioning assertions

  • intel/tools: Pass integers, not enums, to stride()

  • intel/tools: Manually set ARF register file/nr/subnr

  • intel/tools: Don’t hardcode notification register

  • intel/tools: Simplify notification register handling

  • intel/tools: Test notification subregisters

Mauro Rossi (17):

  • android: iris: add iris_seqno.{c,h} to Makefile.sources

  • freedreno/drm: android: add libfreedreno_registers static dependency

  • freedreno: android: add adreno-pm4-pack.xml.h generation to android build

  • android: util: fix build for GL4.1 support

  • android: svga: fix build for GL4.1 support

  • android: aco: add aco_ir.cpp to Makefile.sources

  • android: nvir/gv100: update sources in Makefile.sources

  • android: freedreno: add fd5_layout.c to Makefile.sources

  • android: freedreno/ir3: add missing generated sources and rules

  • android: freedreno/ir3: simplify generated sources rules

  • android: panfrost/encoder: add libmesa_nir static dependency

  • radv: fix build on Android 7 (v2)

  • android: freedreno/registers: fix generated headers rules

  • android: freedreno/ir3: fix include paths

  • android: freedreno/common: add support for libfreedreno_common static

  • android: freedreno: move a2xx disasm out of gallium

  • android: freedreno/common: add libmesa_git_sha1 static dependency

Michel Dänzer (38):

  • gitlab-ci: Use YAML anchor for llvmpipe paths in virgl rules

  • gitlab-ci: Update to current templates

  • gitlab-ci: Move down container_pre_build.sh invocation in x86_build.sh

  • gitlab-ci: Add Debian testing repository for x86_build image

  • gitlab-ci: Install WINE from Debian testing

  • gitlab-ci: Move lib{drm,pciaccess}-dev cross packages out of loop

  • gitlab-ci: Install g++-mingw-w64-x86-64-win32 instead of mingw-w64

  • Revert “ac,radeonsi: fix compilations issues with LLVM 11”

  • Revert “gallium/gallivm: fix compilation issues with llvm 11”

  • gitlab-ci: Enable -Werror in meson-s390x job

  • gitlab-ci: Also list arm/x86_build in needs: of test jobs

  • gitlab-ci: x86_test-base image as common base for x86_test-gl/vk

  • gitlab-ci: Pull in GCC 9 from Debian testing in x86_test-gl/vk images

  • gitlab-ci: Move LLVM/clang 6/7 packages to the x86_build_old image

  • gitlab-ci: Use Debian 10 wine-development packages

  • gitlab-ci: Stop using packages from Debian testing

  • gitlab-ci: Move meson back to x86_test-gl/vk ephemeral packages lists

  • gitlab-ci: Add x86_build-base docker image

  • gitlab-ci: Use separate docker images for cross builds

  • loader/dri3: Add dri3_wait_for_event_locked full_sequence out parameter

  • loader/dri3: Use dri3_wait_for_event_locked in loader_dri3_wait_for_msc

  • loader/dri3: Check for window destruction in dri3_wait_for_event_locked

  • gitlab-ci: Automatically run pipelines for Marge Bot pre-merge only

  • gitlab-ci: Use rules: instead of except:/only: for test-docs job

  • gitlab-ci: Extend .ci-run-policy template for docs jobs

  • gitlab-ci: Do not create the “success” job when the test-docs job exists

  • ci: Use “when: always” for pages job

  • ci: Move deploy stage between container & build stages

  • Revert “loader/dri3: Check for window destruction in dri3_wait_for_event_locked”

  • gitlab-ci: Remove indirect dependencies from needs:

  • gitlab-ci: Drop dependencies:

  • Revert https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4580

  • gitlab-ci: Fix “triggered by Marge for a merge request” rule

  • gitlab-ci: Only trigger test-docs job automatically for MRs

  • ci: Use FDO_CI_CONCURRENT in run-shader-db.sh as well

  • ci: Do not mark container / pages jobs as interruptible

  • ci: Use half as many parallel softpipe / virgl test jobs

  • ci: Use ignore_scheduled_pipelines anchor in .radeonsi-rules

Michel Zou (1):

  • swr: fix build with mingw

Mike Blumenkrantz (73):

  • zink: explicitly zero some arrays in ntv

  • zink: add SpvId returns to a couple ntv functions

  • zink: flush active queries on destroy and free query object

  • zink: fix vkCmdResetQueryPool usage

  • zink: reset query on-demand when beginning a new query from resume

  • zink: always use logical eq ops in ntv with 1bit inputs

  • zink: track program usages for each shader

  • zink: emit interpolation decorations for ntv outputs

  • zink: handle more glsl->spirv builtin translation

  • zink: rework input/output location emission

  • zink: use ‘2’ variants for device props/feats, check features for ext enabling

  • zink: add spirv builder util functions for emitting xfb decorations

  • zink: add spirv_builder methods for OpVectorExtractDynamic and OpVectorInsertDynamic

  • zink: implement streamout and xfb handling in ntv

  • zink: implement transform feedback support to finish off opengl 3.0

  • zink: set PIPE_CAP_VIEWPORT_TRANSFORM_LOWERED and remove POS special casing

  • zink: switch to passing VkPhysicalDeviceFeatures2 in VkDeviceCreateInfo

  • zink: enable xfb extension in screen creation

  • zink: use int assignment for vk int type

  • zink: use correct define value for reserved slot count in ntv

  • zink: clamp VkImageCreateInfo.arrayLayers to 1 for image resource creation

  • zink: unify code for setting resource barriers

  • zink: handle signed and unsigned min/max ops in ntv

  • zink: add ult handling for ntv

  • zink: add bitfield_reverse handling to ntv

  • zink: lower byte/word extract ops in nir

  • zink: handle ixor in ntv

  • zink: handle isign alu in ntv

  • zink: set lower_mul_high and lower_rotate in ntv compiler options

  • zink: use OpFUnordNotEqual for nir_op_fne

  • zink: set lower_uadd_carry in nir options

  • zink: implement VK_EXT_index_type_uint8

  • nir: add lowering pass for clip plane enabling

  • st/program: use nir_lower_clip_disable instead of nir_lower_clip_vs conditionally

  • nir: add lowering pass for fragcolor -> fragdata

  • zink: translate gl_FragColor to gl_FragData before ntv to fix multi-rt output

  • u_prim_restart: handle user buffers in util_translate_prim_restart_ib()

  • nir: allow nir_lower_point_size_mov to run in geometry shader

  • nir: allow nir_lower_clip_halfz to run in geometry shaders

  • zink: rework query handling

  • zink: use #define for number of queries per-pool

  • zink: only stall during query destroy for xfb queries

  • zink: properly handle query pool overflows

  • zink: only reset query pool on query end if current batch isn’t in renderpass

  • zink: use right vulkan type for GL_PRIMITIVES_GENERATED queries

  • zink: handle ntv case of nested loop instructions more permissively

  • zink: add lengthy comment and remove assert from discard_if ntv pass

  • zink: use type of src[0] for ntv store and load ops

  • zink: try copy_region hook for blits where we can’t do a regular blit or resolve

  • zink: block vkCmdBlitImage usage for multi sampled blits

  • zink: block resolve blits for depth/stencil buffers

  • zink: handle empty attachments

  • zink: try to handle multisampled null buffers

  • zink: enable tgsi texcoord pipe cap

  • zink: destroy gfx program when a shader is freed

  • zink: destroy descriptor pools on context destroy

  • zink: free pipeline cache during program destroy

  • zink: free all ntv allocations after creating shader module

  • zink: use helper function to handle uvec/bvec types

  • zink: handle texelFetchOffset with offsets

  • zink: add some asserts for building access chains in ntv

  • zink: omit Lod image operand in ntv when not using an image texture dim

  • nir: allow lower_psiz_mov to run in tessellation stages

  • nir: allow nir_lower_clip_halfz to run in tess eval shader

  • u_prim_restart: handle indirect draws

  • zink: add extension loading framework for spirv builder

  • zink: implement VK_EXT_robustness2

  • zink: clamp PIPE_SHADER_CAP_MAX_SHADER_BUFFERS to PIPE_MAX_SHADER_BUFFERS

  • zink: handle VK_EXT_vertex_attribute_divisor setup

  • zink: store valid timestamp bits onto zink_screen

  • zink: implement handling for VK_EXT_calibrated_timestamps

  • u_prim_restart: add inline function for getting restart index based on index size

  • zink: reorder create_stream_output_target to fix failure case leak

Miklós Máté (1):

  • docs: add some missing stuff to sourcetree.rst

Nanley Chery (18):

  • iris: Drop can_fast_clear_color’s format parameter

  • iris: Remove the CCS_D fallback

  • iris: Avoid fast-clear with incompatible view

  • iris: Disable sRGB fast-clears for non-0/1 values

  • intel: Add ISL_AUX_USAGE_GEN12_CCS_E

  • iris: Don’t support sRGB + Y_TILED_CCS on gen9

  • iris: Use ISL_AUX_USAGE_GEN12_CCS_E on gen12

  • isl/drm: Support I915_FORMAT_MOD_Y_TILED_GEN12_RC_CCS

  • gallium/dri2: Support I915_FORMAT_MOD_Y_TILED_GEN12_RC_CCS

  • iris: Handle importing aux-enabled surfaces on TGL

  • iris: Refactor modifier_is_supported for gen12

  • iris: Support I915_FORMAT_MOD_Y_TILED_GEN12_RC_CCS

  • iris: Zero the add-on clear color BO on import

  • dri_util: Update internal_format to GL_RGB8 for MESA_FORMAT_B8G8R8X8_UNORM

  • iris: Don’t call SET_TILING for dmabuf imports

  • gallium/dri2: Report correct YUYV and UYVY plane count

  • iris: Fix aux assertion in resource_get_handle

  • blorp: Fix alignment test for HIZ_CCS_WT fast-clears

Nataraj Deshpande (3):

  • anv: Limit vulkan version to 1.1 for Android

  • anv: Disable extensions based on Android versions

  • dri_util: Update internal_format to GL_RGB8 for MESA_FORMAT_R8G8B8X8_UNORM

Neha Bhende (6):

  • util: Initialize pipe_shader_state for passthrough and transform shaders

  • util: Add util functionality for GL4.1 support

  • winsys/drm: Add GL4.1 support in drm winsys

  • svga/include: Headers for GL4.1 support

  • svga: Add GL4.1(compatibility profile) support in svga driver

  • svga: Performance fixes

Neil Armstrong (2):

  • Revert “CI: Disable Lima jobs due to lab unhealthiness”

  • Revert “CI: Disable Panfrost Mali-T820 jobs”

Neil Roberts (26):

  • nir/scheduler: Handle nir_intrinsic_load_per_vertex_input

  • v3d: Remove unused member of v3d_compile

  • nir/schedule: Store a pointer to the scoreboard in nir_deps_state

  • nir/scheduler: Add an option to specify what stages share memory for I/O

  • v3d: Let scheduler know GS doesn’t have shared I/O memory

  • gallium: Add pipe cap for primitive restart with fixed index

  • mesa: Add PrimitiveRestartFixedIndex to gl_constants

  • v3d: Disable PIPE_CAP_PRIMITIVE_RESTART

  • v3d: Add missing macro for stvpmd instruction

  • v3d: Use stvpmd for non-uniform offsets in GS

  • compiler: Add a system value for the line coord

  • v3d: Implement the line coord intrinsic

  • nir: Add intrinsics for the line width

  • v3d: Handle the line width intrinsics

  • v3d: Add a lowering pass for line smoothing

  • v3d: Enable perpendicular line caps when line smoothing

  • broadcom/qpu: set VC5_QPU_RADDR_A out of the switch at _pack_branch

  • v3d/compiler: Fix sorting the gs and fs inputs

  • v3d/compiler: Lower geometry output store base into offset src

  • nir/scheduler: Move nir_scheduler to its own header

  • nir/schedule: Store a pointer to the options struct in scoreboard

  • nir/schedule: Add a callback for backend-specific dependencies

  • v3d: Mark scheduling dependency for prim id and first output

  • nir/schedule: Add an option for a fallback scheduling algorithm

  • v3d: Changed v3d_compile:failed to an enum

  • v3d: Retry with the fallback scheduler when RA fails

Oschowa (5):

  • radv: Don’t take absolute value of unsigned type.

  • aco: Don’t declare ‘Block’ as class, but define as struct.

  • aco: Don’t std::move temporary object.

  • aco: Use correct reference type in for-range-loop.

  • radv: Explicitly cast TIMESTAMP_NOT_READY value to uint32_t where needed.

Pablo Saavedra (5):

  • ci: TRACES_DB_PATH and RESULTS_PATH defined as relative paths

  • ci: ArgumentParser receives the args from the main parameters

  • ci: Migrate tracie tests done in shell script to pytest

  • ci: Split test_tracie_skips_traces_without_checksum in separate cases

  • ci: Fix TypeError error when traces in traces.yml is an empty list

Pavel Asyutchenko (1):

  • vulkan/overlay: fix crash on destroying NULL swapchain

Peter Seiderer (3):

  • vc4_bufmgr: fix time_t printf

  • pan_bo.h: add time.h include for time_t

  • v3d_bufmgr: fix time_t printf

Pierre Moreau (4):

  • clover/nir: Check the result of spirv_to_nir

  • clover/api: Address missing braces for subobj init

  • clover: Address unnecessary copy warnings

  • clover/spirv: Remove unused tuple header

Pierre-Eric Pelloux-Prayer (62):

  • radeonsi: fix export count

  • mesa: add gl_context::ForceIntegerTexNearest

  • driconf: add force_integer_tex_nearest option

  • radeonsi: add workaround for issue 2647

  • radeonsi: don’t print gs_copy_shader stats for shaderdb

  • glsl: init gl_FragColor if zero_init=true

  • glsl: rework zero initialization

  • glsl: add a is_implicit_initializer flag

  • mesa: extend GLSLZeroInit semantics

  • gallium: add a new cap PIPE_CAP_GLSL_ZERO_INIT

  • ac/nir: export some undef as zero

  • ac/surface: remove shadowing declaration

  • amdgpu/radeon: add secure api

  • radeonsi: add AMD_DEBUG=tmz option

  • radeon: add RADEON_CREATE_ENCRYPTED flag

  • radeonsi: allocate framebuffer texture as secure when using tmz

  • amdgpu: add encrypted slabs support

  • radeonsi: force using staging texture when uploading to secure texture

  • radeonsi/sdma: implement tmz support

  • gallium: PIPE_RESOURCE_FLAG_ENCRYPTED

  • radeonsi: add support for PIPE_RESOURCE_FLAG_ENCRYPTED

  • amdgpu: use AMDGPU_IB_FLAGS_SECURE when requested

  • radeonsi: determine secure flag must be set for gfx IB

  • radeonsi: do not use cmask with encrypted texture

  • amd/addrlib: fix forgotten char -> enum conversions

  • radeonsi: fix inversed arguments in si_test_gds_memory_management

  • amdgpu: fix uninitialized variable

  • radeonsi/sdma: remove useless compare

  • radeonsi/drirc: enable zerovram option for 7 Days to Die

  • winsys/radeon: do not cast bo->va as void*

  • radeonsi: add return value to gfx10_ngg_calculate_subgroup_info

  • radeonsi/ngg: try GS multi-cycling mode if default mode failed

  • ac/surface: set SCANOUT if surf->is_displayable

  • ac/surface: fix epitch when modifying surf_pitch

  • ac/llvm: load 1 byte at a time if unaligned on gfx10

  • st/mesa: make texture views inherit compressed_data storage

  • radeonsi: bump SI_NUM_SHADER_BUFFERS to 32

  • st/mesa: do not clear NewDriverState for inactive states

  • glsl: reject size1x8 for image variable with floating-point data types

  • ac/llvm: remove the -1 hack from ac_atomic_inc_wrap

  • glsl: don’t expose imageAtomicIncWrap for signed image

  • glsl: only allow 32 bits atomic operations on images

  • glsl: declare gl_Layer/gl_ViewportIndex/gl_ViewportMask as vs builtins

  • st/mesa: set compressed_data to NULL when freed

  • bin/symbols-check.py: add --ignore-symbol argument

  • ac/llvm: export ac_init_llvm_once in targets

  • mesa: rename _mesa_free_errors_data

  • mesa: add bool param to _mesa_free_context_data

  • mesa/st: release debug_output after destroying the context

  • ac/surface: adapt surf_size when modifying surf_pitch

  • radeonsi: adjust epitch for PIPE_FORMAT_R8G8_R8B8_UNORM

  • radeonsi: extend workaround for KHR-GL45.texture_view.view_classes on gfx9

  • ac/llvm: handle static/shared llvm init separately

  • mesa/st: introduce PIPE_CAP_NO_CLIP_ON_COPY_TEX

  • radeonsi: enable PIPE_CAP_NO_CLIP_ON_COPY_TEX

  • ac/llvm: add option to clamp division by zero

  • radeonsi,driconf: add clamp_div_by_zero option

  • radeonsi: use radeonsi_clamp_div_by_zero for SPECviewperf13, Road Redemption

  • glsl: fix per_vertex_accumulator::fields size

  • r600/uvd: set dec->bs_ptr = NULL on unmap

  • radeon/vcn: set dec->bs_ptr = NULL on unmap

  • mesa: fix glUniform* when a struct contains a bindless sampler

Pierre-Loup A. Griffais (2):

  • radv: fix null descriptor for dynamic buffers

  • radv: fix vertex buffer null descriptors

Qiang Yu (6):

  • radeonsi: remove emacs style config file

  • panfrost: don’t always build bifrost_compiler

  • radeonsi: fix syncobj wait timeout

  • radeonsi: fix user fence space when MCBP is enabled

  • radeonsi: fix max syncobj wait timeout

  • radeonsi: fix user fence GPU address

Rafael Antognolli (8):

  • intel: Store the aperture size in devinfo.

  • intel/isl: Update mocs for DG1

  • intel/l3: Return the URB size from devinfo for DG1

  • intel/devinfo: Add function to check for DRM_I915_GEM_GET_TILING.

  • iris/bufmgr: Do not use map_gtt or use set/get_tiling on DG1

  • anv/dg1: Don’t use SET_TILING kernel uapi.

  • iris: Align last_seqnos to 64 bits.

  • anv: Align “used” attribute to 64 bits.

Rhys Kidd (5):

  • nv50_2d: regenerate envytools-based rnndb headers

  • nv50_2d,nvc0_2d: Document SET_PIXELS_FROM_MEMORY_SAFE_OVERLAP from rnndb

  • nvc0_2d: Document SET_PIXELS_FROM_MEMORY_CORRAL_SIZE from rnndb

  • nvc0: fix macro define for NVE4_COPY()

  • nvc0: add documentation for nve4+ (Kepler) COPY class

Rhys Perry (174):

  • aco: remove use of f-strings

  • aco: add message to static_assert

  • nir: add missing group_memory_barrier handling

  • compiler/spirv: flag nclamp/nmin/nmax as exact

  • nir: make fsat return 0.0 with NaN instead of passing it through

  • docs: add src/amd/ to sourcetree.html

  • docs/envvars: document ACO_DEBUG

  • docs/envvars: update RADV_FORCE_FAMILY

  • aco: simplify consecutive ordered vmem/lds writes optimization

  • aco: fix consecutively written vgprs from vmem instructions

  • aco: mark phi definitions as last-seen phi operands

  • aco: consider affinities when creating v_mac_f32

  • aco: improve phi affinities with p_split_vector

  • aco: split operations that use a swap’s definition

  • aco: fix disassembly with LLVM 11

  • nir/opt_if: run opt_peel_loop_initial_if after all other optimizations

  • nir/opt_if: use nir_src_as_bool in opt_peel_loop_initial_if helper

  • aco: fix typo in insert_waitcnt’s kill()

  • nir: fix lowering to scratch with boolean access

  • aco: fix interaction with 3f branch workaround and p_constaddr

  • aco: consider SDWA during value numbering

  • aco: check instruction format before waiting for a previous SMEM store

  • aco: preserve more fields when combining additions into SMEM

  • aco: don’t reorder barriers in the scheduler

  • aco: fix 64-bit shared_atomic_exchange

  • docs: add missing “shader_” in VK_KHR_shader_subgroup_extended_types

  • radv: set keep_statistic_info with RADV_DEBUG=shaderstats

  • ac/gpu_info, radv: set max_wave64_per_simd to 20 on GFX10

  • aco: use v_xor3_b32

  • aco: validate instructions reading/writing upper halves/bytes

  • aco: p_extract_vector in 64-bit u2f16/i2f16

  • aco: allow reading/writing upper halves/bytes when possible

  • aco: prefer 4-byte aligned definitions

  • aco: add Info::{operand_size,definition_size}

  • aco: use Info::definition_size instead of definition’s regclass

  • aco: fix moving sub-dword values out of a register for a fixed definition

  • aco: use num_opcodes instead of last_opcode

  • aco: improve code for f2{i,u}{8,16}

  • aco: use p_as_uniform in emit_vop1_instruction

  • aco: add and set precise flag

  • aco: create mads when signed zeros should be preserved

  • aco: try to use fma instead of mad when denormals are enabled

  • aco: create 16-bit mad/fma

  • aco: update comment about preserving fp16/fp64 denormals

  • aco: create 16-bit input and output modifiers

  • aco: improve sub-dword check for sgpr/constant propagation

  • aco: fix half_pi constant for 16-bit fsin/fcos

  • aco: use 32-bit inline constants for 16-bit integer instructions

  • aco: improve 8/16-bit constants

  • aco: copy-propagate constants through p_extract_vector/p_split_vector

  • aco: optimize 16-bit and 64-bit float comparisons

  • aco: validate sub-dword pseudo instructions

  • aco: add more opcodes to can_swap_operands

  • aco: allow GFX9 partial writes with instructions which use opsel

  • aco: improve check for moving temporaries out of fixed definitions

  • aco: fix encoding of certain s_setreg_imm32_b32 instructions

  • aco: fix validation error from vgpr spill/restore code

  • aco: fix sub-dword opsel/sdwa checks

  • aco: fix validation of opsel when set for the definition

  • aco: shrink ssa_info

  • aco: make ssa_info::label 64-bit

  • aco: shrink mad_info

  • aco: fix edge check with sub-dword temporaries

  • aco: use the same regclass as the definition for undef phi operands

  • radv: add new drirc option radv_no_dynamic_bounds

  • radv: enable radv_no_dynamic_bounds for Path of Exile

  • radv: enable radv_no_dynamic_bounds for more Path of Exile executables

  • nir: slight correction to cube_face_coord constant folding

  • spirv: set variables to restrict by default

  • radv: fix image variable types in meta shaders

  • aco: only use SMEM if we can prove it’s safe

  • aco: allow SMEM for some sub-dword accesses

  • radv/aco,aco: allow SMEM SSBO loads on GFX6/7

  • aco: fix copy+paste error in split_buffer_store

  • aco: don’t store byte-aligned short stores

  • aco: add missing bld.scc() in byte_align_scalar()

  • aco: don’t create byte-aligned short loads

  • aco: fix when sub-dword create_vector operand cannot be placed perfectly

  • aco: improve vectorization of 8/16-bit loads/stores

  • aco: ignore blocked registers when checking edges in get_reg_impl()

  • aco: remove outdated assert in handle_operands()

  • radv: enable zerovram for Quantic Dream games

  • aco: use VOP2 version of v_mbcnt_hi_u32_b32 on GFX6/7

  • aco: rework boolean phi pass

  • aco: create better code for boolean phis with constant operands

  • aco: optimize boolean phis with uniform selections

  • aco: don’t create phis with undef operands in the boolean phi pass

  • aco: read 0 from inactive lanes when using dpp

  • aco: optimize some masked swizzles to DPP

  • aco: implement <32-bit masked_swizzle_amd

  • nir/lower_subgroups: pass options struct to lower_shuffle

  • nir/lower_subgroups: add lower_shuffle_to_swizzle_amd

  • radv: use lower_shuffle_to_swizzle_amd

  • aco: add 32-bit integer addition to can_swap_operands

  • aco: fix underestimated pressure in spiller when a phi has a killed def

  • aco: rewrite graph coloring in spiller

  • aco: use unordered_set for spill id interferences

  • aco: add add_interference() helper

  • aco: use s_round_mode/s_denorm_mode

  • aco: flush denormals before fp16 fabs/fneg if needed

  • aco: fix nir_op_f2f16_rtne with non-default rounding modes

  • aco: set tcs_in_out_eq=false if float controls of VS and TCS stages differ

  • radv: enable more float_controls features

  • aco: properly recognize that s_waitcnt mitigates VMEMtoScalarWriteHazard

  • aco: use s_waitcnt_depctr to mitigate VMEMtoScalarWriteHazard

  • spirv: don’t split memory barriers

  • nir/lower_int64: lower 64-bit amul

  • aco: always set FI on GFX10

  • radv: replace discard with demote for Quantic Dream games

  • aco: implement b2i8/b2i16

  • aco: be more careful combining additions that could wrap into loads/stores

  • aco: allow overflow for some SMEM instructions

  • aco: add NUW flag

  • nir: add nir_unsigned_upper_bound and nir_addition_might_overflow

  • aco: use nir_addition_might_overflow to combine additions into SMEM

  • aco: move some setup code into helpers

  • aco: make validate() usable in tests

  • aco: print ACO IR before scheduling instead of after

  • radv: fix invalid conversion warnings in vk_format.h

  • aco: fix copy of uninitialized boolean

  • aco: fix includes in aco_ir.cpp

  • aco: add missing add_to_hazard_query

  • aco: rework barriers and replace can_reorder

  • radv/aco,aco: use scoped barriers

  • aco: consider intrinsic access in visit_{load,store}_image

  • nir,radv/aco: add and use pass to lower make available/visible barriers

  • aco: enable value numbering of s_buffer_load_*

  • aco: use storage_scratch

  • aco: improve sync_info for TCS output stores

  • aco: improve workgroup-scope and lower vmem/smem barriers

  • aco: create acq+rel barriers instead of acq/rel

  • nir/load_store_vectorize: fix indentation

  • ac/nir: implement scoped_barrier

  • radv: use scoped barriers

  • aco: remove isel for GLSL-style barriers

  • aco: add framework for unit testing

  • aco: add a few tests for the assembler and optimizer

  • aco: add framework for testing isel and integration tests

  • ci: enable ACO tests

  • aco/tests: add tests for sub-dword swaps

  • aco: optimize swizzled SALU 8/16-bit conversions

  • aco: fix waitcnt insertion on GFX10.3

  • aco: don’t create v_mad_f32 on GFX10.3

  • aco: update bug workarounds for GFX10_3

  • aco: fix max_waves_per_simd on Polaris, VegaM and GFX10.3

  • aco: update vgpr_alloc_granule for GFX10.3

  • aco: implement subgroup shader_clock on GFX10.3

  • aco: update aco_opcodes.py for GFX10.3

  • aco: disable SMEM stores on GFX10.3

  • aco: replace MADs in isel with FMA on GFX10.3

  • spirv: set ACCESS_COHERENT for ssbo/global/image atomic load/store

  • radv/aco: enable VK_KHR_memory_model

  • ac/nir: consider an image load/store intrinsic’s access

  • ac/nir: fix coherent global loads/stores

  • radv/llvm: enable VK_KHR_memory_model

  • aco: fix C++11/C++14 compilation

  • aco: set constant_data_offset correctly in the case of merged shaders

  • aco: don’t move memory accesses to before control barriers

  • aco: fix non-rtz pack_half_2x16

  • aco: consider branch definitions in spiller

  • aco: don’t consider the first partial spill if it’s the wrong type

  • aco: don’t fix break condition for break+discard to exec

  • aco: fix regclass checks when fixing to vcc/exec with Builder

  • aco: fix spills_entry heuristic for branch blocks in init_live_in_vars()

  • aco: keep loop live-through variables spilled

  • aco: reserve 2 sgprs for each branch

  • aco: create long jumps

  • aco: fix byte_align_scalar for 3 dword vectors

  • aco: fix one-off error in Operand(uint16_t)

  • nir/opt_if: fix opt_if_merge when destination branch has a jump

  • aco: fix v_writelane_b32 with two sgprs

  • aco: don’t apply constant to SDWA on GFX8

  • radv: initialize with expanded cmask if the destination layout needs it

  • radv,aco: fix reading primitive ID in FS after TES

Rob Clark (265):

  • util/simple_mtx: add assert_locked()

  • freedreno: add screen lock wrappers

  • freedreno: switch to simple_mtx

  • freedreno: fix buffer import

  • gallium: extract out logicop helper

  • freedreno/drm: drop atomic refcnts

  • freedreno/drm: inline the things

  • freedreno/a6xx: small query cleanup

  • freedreno/a6xx: avoid unnecessary clearing VS DP state

  • freedreno/a6xx: move const state to single stateobj

  • freedreno/a6xx: move scissor state to stateobj

  • freedreno/a6xx: limit PROG_FB_RAST state emit

  • freedreno/a6xx: limit LRZ state emit

  • freedreno/a6xx: move blend-color to stateobj

  • freedreno/a6xx: combine sample mask into blend state

  • freedreno/a6xx: skip unnecessary MRT blend state

  • freedreno/a6xx: add OUT_PKT()

  • freedreno/a6xx: convert draw packet to OUT_PKT()

  • freedreno/a6xx: split out const emit

  • freedreno/ir3: inline const emit

  • freedreno/a6xx: convert const emit to OUT_PKT()

  • freedreno: scissor vs disabled scissor micro-opt

  • freedreno/a6xx: more OUT_REG()

  • freedreno: sync registers with envytools

  • freedreno/a6xx: don’t set SP_FS_CTRL_REG0.VARYING for fragcoord

  • freedreno/a6xx: fix LRZ hang

  • freedreno/a6xx: add some more formats

  • freedreno: we don’t need aligned vbo’s

  • freedreno/a6xx: compressed blit fixes

  • freedreno/a6xx: enable tiled compressed textures

  • freedreno/gmem: don’t assume scissor opt when estimating # of bins

  • freedreno: initialize max_scissor

  • freedreno/gmem: add div_align() helper

  • freedreno/gmem: add helper to dump GMEM layout

  • freedreno: add gmemtool

  • freedreno/gmem: relax alignment on a6xx

  • freedreno/gmem: rework gmem layout algo

  • freedreno/ir3: don’t allow negative const_offset

  • freedreno/ir3: fix indirect cb0 load_ubo lowering

  • freedreno/ir3: limit # of tex prefetch by shader size

  • freedreno/ir3/postsched: reset sfu_delay on sync

  • freedreno/ir3/postsched: try to avoid (sy) syncs

  • freedreno/ir3/sched: avoid scheduling outputs

  • freedreno/ir3/sched: try to avoid syncs

  • freedreno/a6xx: fix max-scissor opt

  • freedreno/ir3: use const_index accessors

  • nir: fix indices for ir3 ssbo_atomic intrinsics

  • nir: add helper to copy const_index[]

  • nir: add pass to lower disjoint wrmask’s

  • freedreno/ir3: use lower_wrmasks pass

  • freedreno/fdperf: add dependency on generated headers

  • freedreno/drm: don’t pass thru ‘DUMP’ flag on older kernels

  • freedreno/drm: handle ancient kernels

  • freedreno/ir3: remove Sethi-Ullman numbering pass

  • freedreno/ir3: juggle around ir3_debug_print()

  • freedreno/ir3/dce: report progress

  • freedreno/cf: report progress

  • freedreno/ir3/cp: report progress

  • freedreno/ir3/deps: report progress

  • freedreno/ir3/group: report progress

  • freedreno/ir3/legalize: report progress

  • freedreno/ir3/postsched: report progress

  • freedreno/ir3: add IR3_PASS() macro

  • freedreno/ir3: move where we preserve binning pass inputs

  • freedreno/ir3: be iterative

  • freedreno/ir3: make foreach_src declare cursor ptr

  • freedreno/ir3: make foreach_ssa_src declare cursor ptr

  • freedreno/ir3: make input/output iterators declare cursor ptr

  • freedreno/ir3/group: fix for half-regs

  • freedreno/ir3: fix mismatched flags on split

  • freedreno/ir3/cf: handle multiple cov’s properly

  • freedreno/ir3: fix immed type in create_addr0()

  • freedreno/ir3/print: print cat2 condition

  • freedreno/ir3/cp: fix cmps folding

  • freedreno/ir3: fix mismatched wrmask for overlapping VS inputs

  • freedreno/ir3: add simple validate pass

  • freedreno/ir3: add helpers to deal with src/dst types

  • freedreno/ir3/validate: add checking for types and opcodes

  • freedreno/drm: disallow exported buffers in bo cache

  • freedreno: add batch debugging

  • freedreno: clear last_fence after resource tracking

  • freedreno: handle PIPE_TRANSFER_MAP_DIRECTLY

  • freedreno/gmem: make noscis debug actually do something on a6xx

  • freedreno/gmemtool: make GMEM alignment per-gen

  • freedreno/gmemtool: add a405

  • freedreno/gmemtool: add verbose mode

  • freedreno/gmem: add some asserts

  • freedreno/gmem: fix nbins_x/y mismatch

  • freedreno/gmem: split out helper to calc # of bins

  • freedreno/a6xx: LRZ fix for alpha-test

  • freedreno/a6xx: document LRZ flag buffer

  • freedreno/a6xx: fix vsc assert

  • nir: get_base_type() should return enum type

  • nir: extract out convert_to_bitsize() helper

  • nir/builder: add bitsize conversion helpers

  • nir/lower_tex: fixes for fp16 yuv lowering

  • freedreno/ir3: split kill from no_earlyz

  • freedreno/a6xx: sync registers from envytools

  • freedreno/a6xx: update depth-plane control regs

  • freedreno/a6xx: re-work LRZ state tracking

  • freedreno/a6xx: add early-lrz-late-z mode

  • freedreno/a6xx: also consider alpha-test for ztest-mode

  • freedreno/a6xx: more early-z

  • freedreno/computerator: fix missing dependency on generated header

  • nir/print: print tex dest type

  • freedreno/ir3: add debug code to print conflicting half-regs

  • freedreno/ir3: respect tex prefetch limits

  • freedreno/ir3: remove RA “q-values” optimization

  • freedreno/ir3: limit pre-fetched tex dest

  • freedreno/ir3: unify shader create/delete paths

  • freedreno/ir3: move the libdrm dependency out of shared code

  • turnip: drop linking libfreedreno_drm

  • freedreno/ir3: don’t rely on intr->num_components

  • radv: don’t set num_components for non-vectorized intrinsics

  • nir/builder: don’t set intr->num_components

  • nir/lower-atomics-to-ssbo: don’t set num_components

  • spirv: don’t set num_components for non-vectorised intrinsics

  • v3d: don’t use intr->num_components for non-vectorized intrinsics

  • nir/validate: validate intr->num_components

  • freedreno/log-parser: fix compute times

  • freedreno/sched: reset delay counters at start of block

  • freedreno/ir3/validate: also check instr->address

  • freedreno/ir3/cp: properly handle already-folded RELATIV

  • freedreno: splitup emit_string_marker

  • freedreno/a6xx: emit shader names in debug builds

  • freedreno/ir3/legalize: don’t allow (nopN) if (rptN)

  • freedreno/ir3/print: print (r) flag

  • freedreno/ir3: add test for delay slot calculation

  • freedreno/ir3/delay: calculate delay properly for (rptN)’d instructions

  • freedreno/ir3: add helpers to move instructions

  • freedreno/ir3: delay test support for vectorish instructions

  • freedreno/ir3/cp: extract valid_flags

  • freedreno/ir3: add post-scheduler cp pass

  • freedreno/ir3: convert regmask_t to struct

  • freedreno/ir3: move mergedreg state out of reg

  • freedreno/ir3: decouple regset from gpu gen

  • freedreno/ir3: pass variant to postsched

  • freedreno/ir3: re-work assembler API

  • freedreno/ir3: make mergedregs a property of the variant

  • freedreno/a6xx: set .MERGEDREGS based on variant

  • turnip: set .MERGEDREGS based on variant

  • freedreno/computerator: MERGEDREGS update

  • freedreno/ir3: update obsolete comment

  • spirv: atomic_counter_read_deref is not vectorized

  • spirv: drop some dead code

  • glsl_to_nir: fix is_helper_invocation

  • glsl_to_nir: fix shader_clock

  • glsl_to_nir: fix vote_any/vote_all

  • freedreno/ir3: refactor out helper to compile shader from asm

  • freedreno/ir3: add accessor for const_state

  • freedreno/a6xx: defer userconst cmdstream size calculation

  • freedreno/ir3: move ubo_state into const_state

  • freedreno/ir3: drop shader->num_ubos

  • freedreno/ir3: constify shader key

  • freedreno/ir3: pass variant to ir3_create()

  • freedreno/ir3: convert over to ralloc

  • freedreno/ir3: move num_reserved_user_consts out of const_state

  • freedreno/ir3: un-embed const_state

  • freedreno/ir3: move const_state back to variant

  • freedreno/ir3: move output_loc to variant

  • freedreno/ir3: split out ubo info from range

  • freedreno/ir3: splitup get_existing_range()

  • freedreno/ir3: split ubo analysis/lowering passes

  • ci: remove some freedreno a6xx skips

  • freedreno/ir3: add helper to determine point-coord inputs

  • freedreno/a6xx: de-duplicate vinterp/vpsrepl state building

  • freedreno/a6xx: use point-coord helper

  • freedreno/a5xx: use point-coord helper

  • freedreno/a4xx: use point-coord helper

  • freedreno/a3xx: use point-coord helper

  • freedreno: convert builtin blit VS prog to ureg builder

  • freedreno/ir3: switch PIPE_CAP_TGSI_TEXCOORD

  • freedreno: make foreach_bit() declare its cursor

  • freedreno: split out batch draw tracking helper

  • freedreno: split out batch clear tracking helper

  • freedreno: handle batch flush in resource tracking

  • freedreno/ir3/ra: fix pre-color edge case

  • freedreno/ir3: add ir3_finalize_nir()

  • freedreno/ir3: move finalize_nir to pscreen hook

  • freedreno/ir3: add ir3_compiler_destroy()

  • freedreno/ir3: shuffle some variant fields

  • freedreno/a6xx+ir3: stop generating pointless binning shaders

  • freedreno/ir3: build binning variant at same time as draw variant

  • freedreno/ir3: disk-cache support

  • freedreno/ir3: move nir finalization to after cache miss

  • freedreno/fdperf: fix print of base address

  • freedreno/fdperf: better compatible string matching

  • freedreno/fdperf: prefer render node

  • gitlab-ci: reduce a630 runner load

  • freedreno/ir3: add missing VS driver params

  • freedreno/ir3: make compile fails more visible

  • freedreno/a6xx: bail instead of crash for compile fails

  • freedreno/ir3/ra: be better at failing

  • freedreno/a6xx: don’t enable early-z/lrz if no z-test

  • freedreno/ir3: DCE unused arrays

  • driconf: allowlist/denylist

  • gitlab-ci: re-enable all a630 jobs

  • freedreno: small comment re-word

  • freedreno: whitespace fix

  • freedreno/ir3/parser: half-precision relative regs

  • freedreno/ir3: set array precision on creation

  • freedreno/ir3: fix half-reg array stores

  • freedreno/ir3/ra: debug msgs tweak

  • freedreno/ir3/ra: assign vreg names to all array elements

  • freedreno/ir3/ra: fix array conflicts for split/merged

  • freedreno: sync registers from envytools

  • freedreno: make gen_header.py check parent directory

  • freedreno: slurp in rnndb

  • freedreno: slurp in rnn

  • freedreno: slurp in decode tools

  • freedreno: slurp in afuc

  • freedreno/rnn: warnings cleanup

  • freedreno/decode: warnings cleanup

  • freedreno/afuc: warnings cleanup

  • freedreno: add CI for envytools tools

  • freedreno/ir3: split out regmask

  • freedreno: drop shader_t

  • freedreno: deduplicate a3xx+ disasm

  • freedreno: move a2xx disasm out of gallium

  • freedreno: deduplicate a2xx disasm

  • freedreno/ci: add a2xx trace to CI job

  • freedreno/tools: check rnn parse status

  • freedreno/rnn: split out helper to find files

  • freedreno/rnn: add error helper

  • freedreno/rnn: rename schema file

  • freedreno/rnn: update schema for ‘pos’

  • freedreno/rnn: add relaxed boolean type

  • freedreno/rnn: add high/low/pos to registers

  • freedreno/rnn: add radix/align

  • freedreno/rnn: relax Hexadecimal to HexOrNumber

  • freedreno/rnn: add variants/varset to domain

  • freedreno/registers/a2xx: fix validation error

  • freedreno/registers/a4xx: fix validation error

  • freedreno/registers/adreno_pm4: fix validation errors

  • freedreno/rnn: describe copyright element in schema

  • freedreno/rnn: add “addvariant” to schema

  • freedreno/rnn: allow name to be optional in arrays

  • freedreno/rnn: fix use-group

  • freedreno/registers/mdp5: fix validation error

  • freedreno/rnn: schema updates for dynamic/irregular offsets

  • freedreno/rnn: add schema validation

  • freedreno/rnn: headergen2 warnings cleanup

  • freedreno/decode: cffdec warnings cleanup

  • freedreno/ir3: add missing track_ubo_use()

  • freedreno/a6xx: don’t emit a bogus size for empty cb slots

  • freedreno/a6xx: fixup draw state earlier

  • freedreno/rnn: also look for .xml.gz

  • freedreno/rnn: rework RNN_DEF_PATH construction

  • freedreno/registers: add .gitignore

  • freedreno/registers: split header build into subdirs

  • freedreno/registers: install gzip’d register database

  • freedreno/decode: move dependencies up a level

  • freedreno: allow fence_fd fences to be recycled

  • freedreno/ir3: ir3_cmdline updates

  • freedreno/ir3: lower local_index using local_id

  • glsl/lower_precision: split out const lowering

  • gallium: replace 16BIT_TEMPS cap with 16BIT_CONSTS

  • glsl: remove LowerPrecisionTemporaries

  • glsl: don’t inline intrinsics for mediump

  • glsl_to_nir: fix bitfield_extract with 16-bit operands

  • freedreno/registers: add some missing regs to build

  • freedreno/crashdec: handle section name typos

  • freedreno/a6xx: fix occlusion query with more than one tile

  • freedreno: handle case of shadowing current render target

  • freedreno/gmemtool: add tile_alignw/h and a650

Rohan Garg (3):

  • iris: Fix documentation for _iris_batch_flush

  • ci: Include trace replay support in ARM rootfses.

  • gitlab-ci: Replay traces on lava devices

Roland Scheidegger (1):

  • gallivm: fix half to float conversions with llvm 11

Roman Gilg (2):

  • vulkan/wsi/x11: add sent image counter

  • vulkan/wsi/x11: wait for acquirable images in FIFO mode

Roman Stratiienko (5):

  • egl: Build surfaceless platform on Android

  • Android: Fixes for Q and R

  • panfrost: Android build fixes 2020 week 31

  • lima: Fix lima_screen_query_dmabuf_modifiers()

  • android: freedreno: Another build fix

Sagar Ghuge (3):

  • iris: Use modify disables for 3DSTATE_WM_DEPTH_STENCIL command

  • intel/compiler: Optimize integer add with 0 into mov

  • intel/compiler: Remove unnecessary optimization for MUL

Samuel Pitoiset (235):

  • ci: fix reporting the number of unexpected/flakes

  • ci: add lists of expected failures & skipped tests for RAVEN with ACO

  • aco: remove unnecessary p_split_vector with v2b reg class

  • radv: enable shaderInt16 unconditionally with LLVM and only GFX8+ with ACO

  • radv: cleanup radv_CreateInstance()

  • radv: rename radv_devices() to radv_enumerate_physical_devices()

  • radv: fix a memleak if the physical device initialization failed

  • radv: report INITIALIZATION_FAILED when the amdgpu winsys init failed

  • radv: don’t report error with other vendor DRM devices

  • radv: use a linked list for physical devices

  • radv: display an error message if the winsys init failed

  • radv/winsys: do not count visible VRAM buffers twice in the budget

  • ci: remove unused .test-radv-fossilize rule

  • ci: set ACO_DEBUG=validateir,validatera global for RADV testing

  • ci: run radv-fossils with Pitcairn (GFX6) and Bonaire (GFX7) too

  • radv: remove the LLVM version string when ACO is used

  • radv: do not print the LLVM version string twice in hang reports

  • radv: report correct backend IR in hang reports when ACO is used

  • aco: fix 64-bit trunc with negative exponents on GFX6

  • nir: do not vectorize load/store if offset can overflow and robustness enabled

  • aco: prevent invalid loads/stores vectorization if robustness is enabled

  • radv: limit the Vulkan version to 1.1 for Android

  • radv: handle different Vulkan API versions correctly

  • radv: update the list of allowed Android extensions

  • aco: optimize add/sub(a, cndmask(b, 0, 1, cond)) -> addc/subbrev_co(0, a, b)

  • radv: use the common base object type for VkDevice

  • radv: use the base object struct types

  • radv: implement VK_EXT_private_data

  • vulkan: import common code for generating extensions

  • radv: use the common code for generating extensions and dispatch tables

  • anv: use the common code for generating extensions and dispatch tables

  • turnip: use the common code for generating extensions and dispatch tables

  • radv: add a LLVM version string workaround for SotTR and ACO

  • aco: remove useless check for nir_tex_src_bias

  • aco: add support for texturing with clamped LOD

  • ac/llvm: add support for texturing with clamped LOD

  • radv: enable shaderResourceMinLod

  • spirv: handle OpCopyObject correctly with any types

  • radv: fix missing break in radv_GetPhysicalDeviceProperties2()

  • aco: store 16-bit temporary outputs as v2b

  • aco: convert 16-bit values before exporting MRTs

  • aco: allow to load/store 16-bit values in VMEM for tess and geom

  • aco: implement 8-bit/16-bit mov’s with p_create_vector

  • aco: implement 16-bit vertex fetches with tbuffer_load_format_d16_*

  • aco: validate v_interp_*_f16 as VOP3 instructions instead of VINTRP

  • aco: emit v_interp_*_f16 instructions as VOP3 instead of VINTRP

  • aco: implement 16-bit interp

  • aco: fix off-by-one error with 16-bit MTBUF opcodes on GFX10

  • radv/aco: enable storageInputOutput16 on GFX9+

  • aco: fix missing break in label_instruction()

  • radv: fix missing break in radv_GetPhysicalDeviceFeatures2()

  • radv: fix duplicated expression in ac_setup_rings()

  • radv/winsys: remove useless free in radv_amdgpu_create_bo_list()

  • aco: declare 8-bit/16-bit reduce operations

  • aco: implement 8-bit/16-bit reductions

  • aco: validate 8-bit/16-bit VGPR operands for readfirstlane/readlane/writelane

  • aco: implement 8-bit/16-bit nir_intrinsic_read_first_invocation

  • aco: implement 8-bit/16-bit nir_intrinsic_{shuffle,_read_invocation}

  • aco: implement 8-bit/16-bit nir_intrinsic_quad_*

  • aco: use a temporary SGPR for 8-bit/16-bit literal reduction identities

  • aco: sign-extend the input and identity for 8-bit subgroup operations

  • radv: do not return from radv_GetPhysicalDeviceFeatures2()

  • radv: cleanup physical device features

  • radv: remove useless assignment in build_streamout_vertex()

  • spirv: add ReadClockKHR support with device scope

  • aco: implement nir_intrinsic_shader_clock with device scope

  • ac/nir: fix shader clock with subgroup scope

  • ac/nir: implement nir_intrinsic_shader_clock with device scope

  • radv: advertise shaderDeviceClock on GFX8+

  • spirv: add SpvCapabilityImageGatherBiasLodAMD

  • spirv: add support for bias/lod with OpImageGather

  • ac/nir: add support for bias/lod with texture gather

  • aco: add support for bias/lod with texture gather

  • radv: add support for querying which formats support texture gather LOD

  • radv: advertise VK_AMD_texture_gather_bias_lod

  • spirv,radv,anv: implement no-op VK_GOOGLE_user_type

  • radv/aco: enable VK_EXT_subgroup_size_control

  • aco: fix register allocation for subdword instructions on GFX10

  • aco: implement 8-bit/16-bit reductions on GFX10

  • aco: allocate a temp VGPR for some 8-bit/16-bit reduction ops on GFX10

  • aco: allow gfx10_wave64_bpermute with 8-bit/16-bit input

  • aco: sign-extend input/identity for 32-bit reduce ops on GFX10

  • radv/aco: enable VK_KHR_subgroup_extended_types on GFX8+

  • radv: enable zero VRAM for Doom Eternal

  • radv: enable zero VRAM for all VKD3D (DX12->VK) games

  • aco: implement 16-bit reduce operations on GFX6-GFX7

  • aco: implement 16-bit nir_intrinsic_quad_* on GFX6-GFX7

  • aco: fix subdword copies on GFX6-GFX7

  • aco: sign-extend input/identity for 16-bit subgroup ops on GFX6-GFX7

  • radv/aco: enable 64-bit atomic features if RADV is linked with LLVM 8

  • aco: use v_bfe_u32 for unsigned reductions sign-extension on GFX6-GFX7

  • aco: fix sign-extend 8-bit subgroup operations on GFX6-GFX7

  • aco: fix nir_intrinsic_quad_* with 8-bit in GFX6-GFX7

  • radv/aco: enable VK_KHR_shader_subgroup_extended_types on GFX6-GFX7

  • ac/nir: adjust an assertion for D16 on GFX6-GFX7

  • nir/lower_explicit_io: fix NON_UNIFORM access for UBO loads

  • radv/llvm: expose VK_EXT_shader_demote_to_helper_invocation with LLVM 9+

  • aco: implement 8-bit/16-bit conversions on GFX6-GFX7

  • aco: fix alignment of vectors with 4 elements

  • radv/aco: enable 8-bit/16-bit storage on GFX6-GFX7

  • radv/aco: enable shaderInt16 on GFX6-GFX7

  • radv/aco: enable shaderInt8 and VK_KHR_shader_float16_int8 on GFX6-GFX7

  • ac/nir: fix integer comparisons with pointers

  • radv: set DB_SHADER_CONTROL.CONSERVATIVE_Z_EXPORT correctly

  • radv: add new drirc option radv_enable_mrt_output_nan_fixup

  • aco: implement radv_enable_mrt_output_nan_fixup workaround

  • radv/llvm: implement radv_enable_mrt_output_nan_fixup workaround

  • radv: enable radv_enable_mrt_output_nan_fixup for RAGE 2

  • ac: add ac_choose_spi_color_formats() to common code

  • spirv: fix using OpSampledImage with OpUndef instead of OpType{Image,Sampler}

  • aco: allow to swap operands for some 16-bit float instructions

  • spirv: do not set num_components for non-vectorized mbcnt_amd intrinsic

  • radv/aco: enable FP16 features/extensions on GFX9+

  • radv: lower discards to demote to workaround a RDR2 game bug

  • radv: make sure to set CB_SHADER_MASK correctly for internal CB operations

  • radv: compute CB_SHADER_MASK from the fragment shader outputs

  • radv: only requires LLVM 9 for GFX10 if not using ACO

  • radv: replace == GFX10 with >= GFX10 where it’s needed

  • aco: replace == GFX10 with >= GFX10 where it’s needed

  • radv: add support for Sienna Cichlid

  • radv: require LLVM 11+ for GFX 10.3 if not using ACO

  • aco: fix printing ASM on GFX6-7 if clrxdisasm is not found

  • aco: improve validation checks for readlane/writelane

  • aco: fix printing ASM on GFX6-7 again

  • gitlab-ci: stop testing RADV with LLVM

  • gitlab-ci: update the list of expected CTS failures for RADV/ACO

  • gitlab-ci: update the list of expected failures for Pitcairn

  • radv: fix checking the return value of cs_finalize()

  • gitlab-ci: add parallel-rdp fossils

  • radv: lower 64-bit drcp/dsqrt/drsq for fixing precision issues

  • radv: lower 64-bit dfloor on GFX6 for fixing precision issues

  • gitlab-ci: add a list of expected failures for RADV/ACO on NAVI14

  • gitlab-ci: set the number of Fossilize threads to 4

  • gitlab-ci: append Fossilize stdout/stderr to a file to reduce spam

  • gitlab-ci: attach the Fossilize log file as artifact on failure

  • radv: remove the shader ballot workaround for Youngblood with LLVM

  • radv: remove the load/store workaround for Monster Hunter World with LLVM

  • radv: enable VK_AMD_shader_ballot on GFX6-7 with both compiler backends

  • radv: adjust CB_SHADER_MASK for dual-source blending in the shader info pass

  • radv: rework 8/16-bit color attachment formats detection

  • radv: use SPI_SHADER_ZERO for non-written color attachments

  • radv: add support for MRTs compaction to avoid holes

  • radv: fix wide points and lines

  • radv: fix wide lines with multisample enabled

  • Revert “vulkan/wsi/x11: Ensure we create at least minImageCount images.”

  • radv,vulkan: add a new x11 wsi drirc workaround for DOOM Eternal

  • radv: disable FMASK compression when drawing with GENERAL layout

  • radv: set depth/stencil enable values correctly for the meta clear path

  • radv: implement missing VK_ACCESS_MEMORY_{READ,WRITE}_BIT

  • radv: store the primitive topology hardware value in the pipeline

  • radv: adjust IA_MULTI_VGT_PARAM.WD_SWITCH_ON_EOP at draw time

  • radv: adjust IA_MULTI_VGT_PARAM.PARTIAL_VS_WAVE at draw time

  • radv: compute prim_vertex_count at draw time

  • aco: fix more validation errors from vgpr spill/restore code

  • radv: return VK_ERROR_DEVICE_LOST if wait-for-idle failed or expired

  • radv: remove the secure compile support feature

  • radv: rework dynamic viewports/scissors support

  • radv: add VK_EXT_extended_dynamic_state but leave it disabled

  • radv: declare new extended dynamic states

  • radv: add support for dynamic cull mode and front face

  • radv: add support for dynamic primitive topology

  • radv: add support for dynamic and scissor count

  • radv: add support for dynamic depth/stencil states

  • radv: add support for dynamic vertex input binding stride

  • radv: advertise VK_EXT_extended_dynamic_state

  • radv: add the custom border color BO to the list of buffers

  • radv: destroy the base object if VkCreateQueryPool() failed

  • radv: destroy the base object if VkCreateRenderPass*() failed

  • radv: destroy the base object if VkCreateImage() failed

  • radv: destroy the base object if VkCreateBuffer() failed

  • radv: destroy the base object if VkCreateEvent() failed

  • radv: destroy the base object if VkCreateSemaphore() failed

  • radv: destroy the base object if VkCreateFence() failed

  • radv: destroy the base object if VkAllocateCommandBuffers() failed

  • radv: destroy the base object if VkCreateInstance() failed

  • radv/winsys: replace alloca() by malloc() everywhere

  • radv/winsys: pass the buffer list via the CS ioctl for less CPU overhead

  • radv: fix destroying the syncobj when exporting a fence FD

  • radv: fix the error code when exporting a semaphore/fence fails

  • radv: fix the error code when allocating a fresh imported syncobj fails

  • radv: optimize creating signaled syncobj with amdgpu_cs_create_syncobj2()

  • radv: split fence into two parts as enum+union.

  • radv: remove one useless goto in radv_queue_submit_deferred()

  • radv: improve the error messages when a CS submission failed

  • radv: return better Vulkan error codes when VkQueueSubmit() fails

  • radv: disable CPU caching for IBS to reduce fetch latency

  • radv/winsys: always allow GTT placements on APUs

  • radv: advertise VK_EXT_image_robustness

  • radv: do not perform read-modify-write with the upload BO

  • radv: disable CPU caching for the upload BO to reduce fetch latency

  • aco: add support for nir_intrinsic_shared_atomic_fadd

  • ac/nir: add support for nir_intrinsic_shared_atomic_fadd

  • radv: advertise VK_EXT_shader_atomic_float

  • radv: add missing return values check for some winsys calls

  • radv/winsys: check more allocation failures

  • radv/winsys: remove useless check when binding virtual buffers/images

  • radv/winsys: return a Vulkan error code when binding virtual buffers/images

  • radv/winsys: be more robust when a CS failed during recording

  • radv: remove declared but unused radv_pipeline::is_dual_src

  • radv: remove set but unused radv_pipeline::vertex_elements

  • radv: remove outdated TODO related to PA_SU_VTX_CNTL.PIX_CENTER

  • radv: emit more invariant registers as part of the initial gfx state

  • radv: emit PA_SC_LINE_CNTL as part of the rasterization state

  • radv: clean up VGT_SHADER_STAGES_EN emission

  • radv: clean up PA_SC_CLIPRECT_RULE emission

  • radv: reduce the number of allocated dwords for compute CS

  • radv: clean up radv_compute_generate_pm4()

  • radv: remove unnecessary radv_tessellation_state::num_patches

  • radv: remove no-op si_multiwave_lds_size_workaround()

  • radv: remove one unnecessary param to radv_generate_graphics_pipeline_key()

  • radv: align the LDS size in calculate_tess_lds_size()

  • radv: set LDS TCS size at shaders creation for GFX9+

  • radv: remove unnecessary radv_tessellation_state::lds_size

  • radv: clean up tessellation state emission

  • radv: add radv_pipeline_init_input_assembly_state()

  • radv: add radv_pipeline_generate_vgt_gs_out()

  • radv: clean up adjusting MSAA state if conservative rast is enabled

  • radv: clean up binning state initialization

  • radv: assign pipeline gfx fields before PM4 emission

  • radv: constify all radv_pipeline_generate_*() helpers

  • radv: add radv_pipeline_init_shader_stages_state()

  • radv: remove useless return value to radv_pipeline_scratch_init()

  • radv: clean up remaining pipeline init functions

  • radv: print warnings for famous RADV_PERFTEST options that no longer exist

  • radv: do not honor a user-specified pitch on GFX 10.3

  • radv: increase minimum NGG vertex count requirement per workgroup on GFX 10.3

  • radv: fix sample shading on GFX 10.3

  • radv: set BYPASS_VTX_RATE_COMBINER_GFX103 on GFX 10.3

  • radv/gfx10: add missing initialization of registers

  • radv: limit LATE_ALLOC_GS to prevent a GPU hang on GFX10

  • radv: fix emitting the border color pointer on the compute queue

  • nir/algebraic: mark some optimizations with fsat(NaN) as inexact

  • aco: handle unaligned loads on GFX10.3

  • spirv: fix emitting switch cases that directly jump to the merge block

  • radv: fix transform feedback crashes if pCounterBufferOffsets is NULL

Satyajit Sahu (1):

  • frontends/va: Handle dynamic resolution/SVC for VP9

Satyeshwar Singh (1):

  • intel/dev: Don’t consider all TGL SKUs as GT1 only

Serge Martin (3):

  • amd/common: Fix incorrect use of asprintf instead of vasprintf

  • clover: add more cl_mem_object_type to pipe_texture_target mapping

  • clover: implements clEnqueueFillBuffer

Shawn Guo (1):

  • freedreno/a4xx: fix *_NONE enum conversion

Simon Ser (3):

  • EGL: sync headers with Khronos

  • gbm: document that gbm_bo_map exposes a linear view

  • radv: use bitshifts for debug enum values

SureshGuttula (1):

  • radeon/vcn: Corrected vp9 ref associated data in case of target->codec is NULL

Tapani Pälli (14):

  • st/mesa: destroy only own program variants when program is released

  • anv: call base finish only if pass given in DestroyRenderPass

  • anv: add VK_EXT_extended_dynamic_state but leave it disabled

  • anv: add new dynamic states

  • anv: consider dynamic state when creating pipeline

  • anv: handle dynamic viewport count

  • anv: add support for dynamic cull mode and winding order

  • anv: add support for dynamic viewport and scissor with count

  • anv: add support for dynamic primitive topology change

  • anv: depth/stencil dynamic state support

  • anv: dynamic vertex input binding stride and size support

  • anv: toggle on VK_EXT_extended_dynamic_state

  • anv: add a check for depthStencilState before using it

  • anv: null check for buffer before reading size

Thong Thai (8):

  • radeon: Fix whitespaces

  • gallium/auxiliary/vl: Fix compute shader scaling for non-square pixels

  • gallium/auxiliary/vl: Fix compute shader scale_y for interlaced videos

  • frontends/va: Fix deinterlace bottom field first flag

  • frontends/vdpau: Default destination rect to source rect

  • radeon/vcn: add vcn 3.0 encode support

  • radeonsi: use PIPE_FORMAT_P010 for 10-bit VP9 decoding

  • radeon/vcn: increase render_pic_list size

Timothy Arceri (69):

  • glsl: stop cascading errors if process_parameters() fails

  • glsl: fix slow linking of uniforms in the nir linker

  • radv: fix regression with builtin cache

  • nir: add glsl_get_ifc_packing() helper

  • nir: add callback to nir_remove_dead_variables()

  • glsl: add can_remove_uniform() helper to the NIR linker

  • glsl: remove dead uniforms in the nir linker

  • glsl/spirv: remove dead uniforms in spirv nir linker

  • gitlab-ci: bump piglit checkout commit

  • i965: call brw_nir_lower_uniforms() after uniform linking is complete

  • util: add BITSET_LAST_BIT() helper

  • glsl: add struct to gather more info about uniform array access

  • glsl: add update_array_sizes() helper to the NIR uniform linker

  • glsl: gather uniform dereference info before main linking loop

  • glsl: when NIR linker enable use it to resize uniform arrays

  • glsl: fix potential slow compile times for GLSLOptimizeConservatively

  • glsl: fix incorrect optimisation in opt_constant_variable()

  • glsl: fix uniform array resizing in the nir linker

  • glsl: small optimisation fix for uniform array resizing

  • st_glsl_to_nir: fix potential use after free

  • mesa: remove _mesa prefix from static function

  • mesa: add _mesa_program_state_value_size() helper

  • glsl: define gl_LightSource members in ARB_vertex_program order

  • st/glsl_to_nir: disable st_nir_lower_builtin() when packing supported

  • glsl: remove stale FIXME

  • i965: add and fix fallthrough comments

  • llvmpipe: add missing fallthrough comments

  • gallivm: add missing break

  • anv: update fallthrough comment so gcc sees it

  • intel/compiler: add and fix up fallthrough comments for gcc warnings

  • iris: add missing fallthrough comment

  • egl: move fallthrough comment so gcc can see it

  • nir: add missing break to nir_opt_access()

  • mesa: fix fallthrough in glformats

  • mesa: add fallthrough comments to glformats.c

  • mesa: add fallthrough comments to get.c

  • nir: fix implicit fallthrough warnings

  • mesa: add fallthrough comments to COPY_SZ_4V()

  • radeonsi: add missing fallthrough comment

  • glx: add missing fallthrough comment

  • glsl: move fallthrough comment to where gcc can see it

  • radeon: add missing fallthrough comments

  • spirv: add missing fallthrough comments

  • mesa/vbo: add some missing fallthrough comments

  • mesa: add missing fallthrough comment to teximage.c

  • mesa: fix unintended fallthrough in glIsEnabled()

  • r300: add and fix up fallthrough comments

  • svga: add missing fallthrough comments

  • mesa: update fallthrough comment so gcc can see it

  • nv30: add missing fallthrough comment

  • meson: turn on Wimplicit-fallthrough project wide

  • nouveau: fix pointer-sign warning

  • gitlab-ci: Enable -Werror in meson-classic job

  • r600/radeonsi: silence zero-length-bounds gcc warnings

  • radeonsi: fix SI_NUM_ATOMS

  • iris: fix maybe-uninitialized warning for initial_state variable

  • iris: silence maybe-uninitialized for stc_dst_aux_usage variable

  • nouveau/nvc0: silence maybe-uninitialized warning

  • panfrost: add some missing fallthrough comments

  • panfrost: hide more unused code in bi_lower_combine.c

  • panfrost: add some missing fallthrough comments to bi_pack.c

  • freedreno: fix missing fallthrough comments

  • v3d: remove redefine of VG(x)

  • zink: fix missing fallthrough comment

  • nine: remove unused var

  • etnaviv: add missing fallthrough comments

  • lima: add missing fallthrough comments

  • lima: add missing break

  • gitlab-ci: Enable -Werror in meson-gallium job

Timur Kristóf (4):

  • aco/gfx10: Refactor of GFX10 wave64 bpermute.

  • aco: Implement subgroup shuffle on GFX6-7.

  • radv/aco: Always enable subgroup shuffle.

  • aco: Fix emit_boolean_exclusive_scan in wave32 mode.

Tomeu Vizoso (55):

  • panfrost: Emit blend descriptors on Bifrost

  • panfrost: Don’t leak temporary descriptors array

  • pan/decode: Check for correct unknown field

  • pan/decode: Use correct printf modifier for long int

  • panfrost: Split bit out of format.unk3

  • panfrost: Create additional BO for the checksum of imported BOs (Bifrost)

  • panfrost: Add a bit more info about some tiler fields

  • pan/bi: Print shaders only if BIFROST_MESA_DEBUG=shaders

  • pan/decode: Trace to stderr with PANDECODE_DUMP_FILE=stderr

  • panfrost: GPUs newer than G-71 don’t have swizzles…

  • panfrost: mali_attr_meta.unknown1 is zero on Bifrost

  • panfrost: Add Bifrost texture trampoline BO to batch

  • pan/decode: Properly print tripped zeroes

  • virgl: Properly check for encode_stride when encoding transfers

  • panfrost: Add checksum BOs to batch

  • panfrost: Don’t trample on top of Bifrost-specific unions

  • panfrost: Handle MALI_RGB8_UNORM in panfrost_format_to_bifrost_blend

  • gitlab-ci: Run more dEQP tests for virgl

  • gitlab-ci: Add manual tests for Virgl using GLES on the host

  • gitlab-ci: Test virgl with Khronos’ OpenGL CTS

  • gitlab-ci: Update CTS runner

  • ci: Don’t call renderdoc’s ReplayController.Shutdown()

  • ci: Move ARM rootfses to stable

  • gitlab-ci: Build kernel drivers for a few ethernet USB dongles

  • gitlab-ci: More stable URL for kernel and ramdisk artifacts, for LAVA

  • gitlab-ci: Remove left-behind rules:

  • gitlab-ci: Don’t rebuild kernels and rootfs if they have been already built in mainline

  • gitlab-ci: Run all of GLES3 tests for Panfrost

  • gitlab-ci: Re-add kernels for bare-metal

  • gitlab-ci: Download traces from MinIO

  • gitlab-ci: Upload tracie artifacts to MinIO

  • gitlab-ci: Fix needs: of the arm64 LAVA test jobs

  • ci: Upload images of failed replays to MinIO

  • ci: Use smaller glxgears trace

  • ci: Prefix tracie artifacts with the device name

  • ci: Test with more traces

  • ci: Disable trace testing on Mali T760

  • ci: Fix the overwriting of traces.yml for baremetal

  • ci: Namespace trace artifacts to the job number

  • ci: Always print status code of HTTP uploads in tracie

  • ci: Print load stats after running dEQP

  • ci: Fix URL for glslang

  • ci: Don’t ship vk-build-programs after building dEQP

  • ci: Split building of libdrm to its own script

  • ci: Build kernels and rootfs for x86 devices

  • ci: Upload reference images for traces

  • ci: Print URL to image diff when a trace replay fails

  • ci: Generate MinIO credentials within LAVA jobs

  • ci: Set date in LAVA DUTs from NTP servers

  • ci: Build-test Panfrost tools

  • ci: Upload traces’ reference and actual images to MinIO

  • ci: Download traces from MinIO in baremetal runs

  • ci: Remove kernel module build that slipped in

  • ci: Actually upload trace artifacts to MinIO for baremetal

  • ci: Use a rootfs tarball for NFS root, instead of a ramdisk (for LAVA)

Tony Wasserka (4):

  • nir/lower_idiv: Port recent LLVM fixes to emit_udiv

  • radv: Fix various non-critical integer overflows

  • aco: Fix integer overflows when emitting parallel copies during RA

  • amd/common: Fix various non-critical integer overflows

Vinson Lee (25):

  • freedreno: Add missing break statement.

  • llvmpipe: Fix variable name.

  • r600/sfn: Initialize VertexStageExportForGS m_num_clip_dist member variable.

  • panfrost: Ensure final.no_colour is initialized.

  • r600/sfn: Use correct setter method.

  • freedreno: Add missing va_end.

  • pan/bi: Initialize struct fma_op_info member extended.

  • zink: Check fopen result.

  • etnaviv: Fix memory leak on error path.

  • panfrost: Fix printf format specifier.

  • r300g: Remove extra printf format specifiers.

  • vdpau: Fix wrong calloc sizeof argument.

  • mesa: Fix NetBSD compiler macro.

  • Switch from cElementTree to ElementTree.

  • intel/genxml: Migrate from deprecated xml.etree.ElementTree getchildren.

  • rbug: Fix rbug_delete_vs_state lock acquisition.

  • nir: Add nir_lower_clip_disable.c to SCons build.

  • util: Fix SCons build.

  • util: Fix memory leaks in unit test.

  • meson: Fix lmsensors warning message.

  • vulkan: Fix memory leaks.

  • freedreno: Fix file descriptor leak.

  • svga: Fix unused printf argument.

  • freedreno: Check file descriptor before write.

  • panfrost: Delete debug allocated syncobj.

Yevhenii Kharchenko (1):

  • st/mesa: fix corrupted texture levels, when adding more levels than expected

Yevhenii Kolesnikov (5):

  • glsl: subroutine signatures must match exactly

  • nvir: don’t use designated initialisers in C++ code

  • intel/compiler: don’t propagate cmp to add if add is saturated

  • mesa: change error code of *TextureSubImage* for incorrect target

  • nine: fix incorrect calculation of layer count for 3D textures

jzielins (2):

  • gallium/swr: Fix compilation warnings

  • swr: Bump maximum 2D texture size to 16kx16k

mmenzyns (1):

  • nv50: Clear nv50_ir_prog_info of dead and codegen specific variables