Reduce pystring usages #2313
I have temporarily removed a couple of combinations from our CI/CD Linux tests here so that at least more of the tests can run, since one test from the OS category can stop the other jobs from running. There is an ongoing discussion in Slack regarding some Linux combinations and the recent … I will not merge this request until a conclusion is reached in that regard. The initial message from Jean-Francois Panisset:
Signed-off-by: Mei Chu <meimchu@gmail.com>
Signed-off-by: Doug Walker <doug.walker@autodesk.com> Signed-off-by: Mei Chu <meimchu@gmail.com>
…ation#2315)

* Fix 2023.3 Linux container break

Signed-off-by: Doug Walker <doug.walker@autodesk.com>

* Adjust to keep the name constant to use the existing GitHub branch rules

Signed-off-by: Doug Walker <doug.walker@autodesk.com>

---------

Signed-off-by: Doug Walker <doug.walker@autodesk.com>
Signed-off-by: Mei Chu <meimchu@gmail.com>
…demySoftwareFoundation#2271)

* Add DirectX 12 GPU backend for automated unit testing on Windows

Introduce a DirectX 12 / HLSL rendering backend alongside the existing OpenGL / GLSL and Metal / MSL backends, enabling the GPU unit test suite to run natively on Windows without requiring an OpenGL context.

Key changes:

GraphicalApp abstract interface (graphicalapp.h/cpp)
  Backend-agnostic base class extracted from OglApp. OglApp and MetalApp now inherit from it.

DxApp (dxapp.h/cpp) -- DirectX 12 backend
  Off-screen RGBA32F render target, full-screen triangle via SV_VertexID, staging readback, SM 6.0 DXC shader compilation.

HLSLBuilder (hlsl.h/cpp) -- HLSL shader generation
  Translates GpuShaderDesc into HLSL pixel shaders with 1D and 3D LUT texture uploads in RGBA32F format.

CMake integration
  OCIO_DIRECTX_ENABLED option, FetchContent for DirectX-Headers, auto-copy of DXC runtime DLLs to the test output directory.

Test tolerance adjustments
  Minor epsilon increases for 4 tests due to DX12/SM6.0 FMA and pow() precision differences. All 263 GPU tests pass on the DirectX 12 backend.

Build and run:

  # Configure (OCIO_DIRECTX_ENABLED defaults to ON on Windows)
  cmake -S . -B build -DCMAKE_BUILD_TYPE=Release

  # Build the GPU test binary
  cmake --build build --target test_gpu_exec --config Release

  # Run GPU tests with the DX12 backend
  ctest --test-dir build -C Release -R test_dx

Signed-off-by: Eric Renaud-Houde <eric.renaud.houde@gmail.com>

* Fix post-rebase issues found in code review

- HeadlessOglApp::printGraphicsInfo() was calling pure virtual base (crash on headless EGL)
- graphicalapp.cpp included oglapp.h unconditionally; guard under OCIO_GL_ENABLED
- tests/gpu/CMakeLists.txt early-return guard excluded Vulkan-only builds
- Add missing test_vulkan ctest entry

Signed-off-by: Eric Renaud-Houde <eric.renaud.houde@gmail.com>

* Minor additional comments, formatting and fixes.

Signed-off-by: Eric Renaud-Houde <eric.renaud.houde@gmail.com>

* Speed up DX12 GPU test backend (~19%)

The DX12 test suite was noticeably slower than the OpenGL and Vulkan backends. Profiling the run showed the gap was almost entirely in DXC shader compilation, not in Present, fence waits, or DxcCreateInstance as initially suspected.

Three low-risk changes:

- Cache IDxcUtils and IDxcCompiler3 as DxApp members instead of recreating them on every setShader() call. The COM instances are thread-safe and perfectly reusable; recreating them per test added no value.
- Compile the full-screen-triangle vertex shader exactly once and reuse the bytecode across all tests. The VSMain HLSL is a hard-coded SV_VertexID-driven triangle with no test-specific state; the bytecode is identical every time. Extracted into a new ensureVertexShaderCompiled() helper. This alone eliminated the biggest redundancy (263 duplicate VS compiles).
- Present(1, 0) → Present(0, 0). VSync is meaningless for an off-screen test harness that reads back from a float render target. Locally the win shows up mostly in waitForPreviousFrame, which was being throttled by the swap-chain pipeline even on an invisible window.

All 263/263 tests still pass; no tolerance changes, no DXIL codegen changes (except for a UTF8 fix), no precision risk.

Signed-off-by: Eric Renaud-Houde <eric.renaud.houde@gmail.com>

* Several small fixes tidying up the recently-added GPU test infrastructure.

- Fix unused-variable warnings (fatal on macOS with warnings-as-errors): guard useDxRenderer and useVulkanRenderer declarations with the same ifdefs as their usage sites. useMetalRenderer stays unconditional because it's referenced on all platforms.
- Propagate the MSVC+shared-libs PATH workaround to test_vulkan so it can find OpenColorIO_*.dll at runtime, matching what's already done for test_dx.
- Upgrade the dxcompiler.dll detection message from STATUS to WARNING and rewrite it to name OCIO_DIRECTX_ENABLED and offer concrete recovery paths. The previous STATUS message was easy to miss, leaving users with a silent degradation until test_dx failed at runtime.
- Rename the OpenGL ctest from test_gpu to test_opengl now that sibling backend-specific tests (test_dx, test_vulkan, test_metal) exist. The test_gpu_exec binary keeps its name since it's backend-agnostic and selects via CLI flags.
- Declare OCIO_VULKAN_ENABLED as a first-class CMake option with mark_as_advanced, matching the existing OCIO_DIRECTX_ENABLED. It was previously used in conditionals without ever being declared, so it never appeared as a toggle in ccmake/cmake-gui.
- Document both OCIO_DIRECTX_ENABLED and OCIO_VULKAN_ENABLED in docs/quick_start/installation.rst, noting that Vulkan requires an external SDK.

Signed-off-by: Eric Renaud-Houde <eric.renaud.houde@gmail.com>

* Integrate DirectX-Headers with OCIO's external-package pattern

Previously InstallDirectXHeaders.cmake was included unconditionally from oglapphelpers/CMakeLists.txt, so DirectX-Headers was always fetched from GitHub regardless of whether the user had a local copy installed. There was no way to use a system install, a vendored copy, or an air-gapped build, and the dep didn't respect OCIO_INSTALL_EXT_PACKAGES.

DirectX-Headers is now a first-class OCIO dependency, handled the same way as Imath, ZLIB, yaml-cpp, etc.: try find_package first, fall back to FetchContent only if not found and OCIO_INSTALL_EXT_PACKAGES allows it.

Changes:

- New share/cmake/modules/FindDirectX-Headers.cmake, modeled on FindImath.cmake.
- InstallDirectXHeaders.cmake → InstallDirectX-Headers.cmake (the hyphen matches OCIO's Install convention).
- oglapphelpers/CMakeLists.txt now calls ocio_handle_dependency(DirectX-Headers ...) with MIN_VERSION 1.606.0 (Windows SDK 22H2 era, old enough to cover most installed copies) and RECOMMENDED_VERSION 1.619.1 (the version OCIO pins and validates).

For users: a local DirectX-Headers install can now be supplied via any of the standard CMake mechanisms (-DDirectX-Headers_DIR, -DDirectX-Headers_ROOT, -DDirectX-Headers_INCLUDE_DIR), or globally with -DOCIO_INSTALL_EXT_PACKAGES=NONE to forbid any network fetch.

Signed-off-by: Eric Renaud-Houde <eric.renaud.houde@gmail.com>

* Improve dxcompiler.dll diagnostics and allow overriding its path

Addresses test crashes seen on stuck Windows 10 hosts caused by an old dxcompiler.dll shipped in that host's Windows SDK Redist.

- Print the version of the found dxcompiler.dll at configure time so crash reports identify the exact DXC build without follow-up diagnostics.
- Emit a standing hint pointing at the DirectX Shader Compiler releases page, which is the documented workaround.
- New -DOCIO_DXCOMPILER_DLL=<path> overrides the Windows SDK Redist search, letting users supply a newer DLL pre-build instead of copying it by hand after.
- Extracted the DXC-runtime logic into share/cmake/utils/LocateDXCompilerRuntime.cmake so tests/gpu/CMakeLists.txt stays focused on the test target.

Signed-off-by: Eric Renaud-Houde <eric.renaud.houde@gmail.com>

* Minor comment tweaks in LocateDXCompilerRuntime.cmake.

Signed-off-by: Eric Renaud-Houde <eric.renaud.houde@gmail.com>

* Use OCIO_DirectX-Headers_RECOMMENDED_VERSION in InstallDirectX-Headers.cmake

ocio_install_dependency already propagates the RECOMMENDED_VERSION from the ocio_handle_dependency call site. Consume it instead of hardcoding the version a second time. Matches the pattern in Installyaml-cpp.cmake and Installpystring.cmake.

Signed-off-by: Eric Renaud-Houde <eric.renaud.houde@gmail.com>

* Address local cleanup notes from PR AcademySoftwareFoundation#2271 Claude review.

* Name CbvSrvHeapSize and throw in setShader if a shader needs more SRV slots than the heap holds.
* Guard ~DxApp() so the GPU wait/CloseHandle are skipped when sync objects were never created (constructor partial-init).
* Comment the 16-byte float4 stride used when packing UNIFORM_VECTOR_FLOAT/INT arrays into the HLSL constant buffer.
* Only record m_windowClassName when RegisterClassExA actually succeeds, so cleanup won't unregister a class owned by another DxApp.
* Drop the redundant trailing else in GPUUnitTest.cpp's shadingLanguage selector (initializer already covers it).

Signed-off-by: Eric Renaud-Houde <eric.renaud.houde@gmail.com>

---------

Signed-off-by: Eric Renaud-Houde <eric.renaud.houde@gmail.com>
Co-authored-by: Doug Walker <doug.walker@autodesk.com>
Signed-off-by: Mei Chu <meimchu@gmail.com>
ba9f569 to
d24be22
if (n <= 0) return "";
if (n == 1) return str;

std::ostringstream os;
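I wonder if something like the following might be 'better'/faster:

#include <cstddef>
#include <string>
#include <string_view>

std::string repeat_fast(const std::string& input, std::size_t n) {
    if (n == 0) {
        return {};
    }
    if (n == 1) {
        return input;
    }
    std::string result;
    const auto input_size = input.size();
    // Reserve the final size once so the appends below should not need to reallocate.
    result.reserve(input_size * n);
    result = input;
    // Double the buffer while the doubled size still fits: 1, 2, 4, ... copies.
    std::size_t current = 1;
    while (current * 2 <= n) {
        result += result;
        current *= 2;
    }
    // Append the leftover copies, taken from the start of the buffer.
    const auto remaining = n - current;
    if (remaining > 0) {
        result += std::string_view(result.data(), input_size * remaining);
    }
    return result;
}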
Edited because I nerd-sniped myself and ran some performance tests.
Edited again to avoid an extra allocation.
You might want to call the function Repeat or Repeat_n rather than Multiply.
Note that reserve() can throw an exception (std::length_error) if the requested size exceeds std::string::max_size(), but we would likely run out of memory before that happens.
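To illustrate the point about reserve(), here is a minimal sketch of how the two failure modes differ. The helper name `ReserveRejects` is hypothetical and not part of OCIO; it only assumes the standard behaviour that `std::string::reserve()` throws `std::length_error` when the request exceeds `max_size()`.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of OCIO): reports whether reserving
// 'requested' characters is rejected up front with std::length_error.
inline bool ReserveRejects(std::size_t requested)
{
    std::string s;
    try
    {
        s.reserve(requested);
    }
    catch (const std::length_error &)
    {
        return true;   // requested exceeds s.max_size()
    }
    catch (const std::bad_alloc &)
    {
        return false;  // the size was legal, the allocation itself failed
    }
    return false;
}
```

A request just below max_size() is still legal as far as the string is concerned, so in practice the allocation is likely to fail (or exhaust swap) long before the length check triggers.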
What a wonderful suggestion. I just want to speak briefly on what I've done so far to get your eyes on it.
- Implemented your function above as Repeat(). Made some small changes for syntax consistency.
- I attempted to put a max_limit check as you've pointed out prior to reserve(). I chose not to throw an exception but I'm open to suggestions. It is a new behaviour compared to the old pystring::mul.
- I have replaced the single StringUtils::Multiply() usage found in the code base (which was using pystring::mul() before).
- I did do a quick benchmark testing between Multiply() and Repeat(). I got a string of 23 chars in length and repeated it 1000 times. The difference is significant and I like the pointer arithmetic in this implementation. I'm happy to remove Multiply() completely in favour of Repeat() if that is the desired decision.
Multiply() Total Time: 388.173 ms
Multiply() Avg Time: 0.0388174 ms/iter
Repeat() Total Time: 4.99192 ms
Repeat() Avg Time: 0.000499192 ms/iter
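For anyone wanting to reproduce this kind of comparison, a rough sketch of a timing harness follows. `MultiplySketch`, `RepeatSketch` and `AverageMs` are hypothetical stand-ins written for illustration; the real Multiply() and Repeat() in StringUtils may differ in detail, and the numbers above did not come from this code.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <sstream>
#include <string>

// Stand-in for a stream-based approach (assumption: the real Multiply()
// may differ; only its use of std::ostringstream is visible in the diff).
inline std::string MultiplySketch(const std::string & str, std::size_t n)
{
    std::ostringstream os;
    for (std::size_t i = 0; i < n; ++i) { os << str; }
    return os.str();
}

// Stand-in for the doubling approach suggested in the review.
inline std::string RepeatSketch(const std::string & str, std::size_t n)
{
    if (n == 0) { return {}; }
    std::string result;
    result.reserve(str.size() * n);
    result = str;
    std::size_t copies = 1;
    while (copies * 2 <= n)
    {
        result += result;  // doubles the number of copies held
        copies *= 2;
    }
    // Append the leftover copies from the front of the buffer.
    result.append(result, 0, str.size() * (n - copies));
    return result;
}

// Runs 'fn' 'iterations' times and returns the average wall time in ms.
template <typename Fn>
double AverageMs(Fn && fn, std::size_t iterations)
{
    volatile std::size_t sink = 0;  // keeps the calls from being optimised away
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; ++i)
    {
        sink = sink + fn().size();
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> total = stop - start;
    return iterations == 0 ? 0.0 : total.count() / static_cast<double>(iterations);
}
```

Calling `AverageMs([&] { return RepeatSketch(text, 1000); }, 10000)` and the same with `MultiplySketch` gives a per-iteration figure comparable in shape to the ones quoted above; absolute values depend on compiler, flags and hardware.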
// In place replace the 'search' substring by the 'replace' string in 'str'.
inline bool ReplaceInPlace(std::string & subject, const std::string & search, const std::string & replace)
// In place replace the 'search' substring by the 'replace' string in 'subject'. Limited by 'count'.
inline bool ReplaceInPlace(std::string & subject, const std::string & search, const std::string & replace, int count)
count should possibly be a size_t? If we really need signed values, POSIX has ssize_t (C++20 adds only the std::ssize() function, not a std::ssize_t type), or we could use std::ptrdiff_t.
Thank you for this suggestion. I've replaced it with size_t as you've suggested to make it consistent with the other size_t usages. I've also removed a signed integer comparison in ReplaceInPlace() accordingly.
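As a rough illustration of what a size_t-based count can look like, here is a sketch. It is not the OCIO implementation: `ReplaceInPlaceSketch` is a hypothetical name, and the meaning of 'count' (a cap on the number of replacements, with the maximum size_t value meaning "no limit") is an assumption made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <string>

// Hypothetical sketch, not the OCIO implementation. Assumption: 'count' caps
// the number of replacements and the maximum size_t value means "no limit".
inline bool ReplaceInPlaceSketch(std::string & subject,
                                 const std::string & search,
                                 const std::string & replace,
                                 std::size_t count = std::numeric_limits<std::size_t>::max())
{
    if (search.empty()) { return false; }  // an empty pattern would never advance

    bool changed = false;
    std::size_t pos = 0;
    std::size_t iter = 0;  // unsigned, so it compares cleanly against 'count'
    while (iter < count && (pos = subject.find(search, pos)) != std::string::npos)
    {
        subject.replace(pos, search.size(), replace);
        pos += replace.size();  // continue after the inserted text
        ++iter;
        changed = true;
    }
    return changed;
}
```

With both the loop counter and the limit unsigned, no signed/unsigned comparison is needed, which is the point of the change discussed above.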
size_t pos = 0;
size_t pos = 0;
int iter = 0;
More size_t replacement as suggested!
}

// Replace the 'search' substring by the 'replace' string in 'subject'. Limited by 'count'.
inline std::string Replace(const std::string & subject, const std::string & search, const std::string & replace, int count)
More size_t replacement as suggested!
// Check for overflow if n is greater than maximum repeatable amount via string max_size.
// New behaviour compared to pystring::mul and StringUtil::Multiply.
// limits.h says size_t max size is 18446744073709551615.
if (n > str.max_size() / str_size) { return {}; }
Ah, I didn't mean you needed to detect this case; I don't think you need this. If you remove it, the reserve() call below will deal with it. With this check you could think of it as silently "failing".
Removed, letting it YOLO! When I was testing this, I mistakenly let it run and it filled up my swap pretty fast. Also of note: this is a private function only used in Op's std::string SerializeOpVec(const OpRcPtrVec & ops, int indent=0);. I had reservations about that function's indent parameter being an int. As SerializeOpVec is internal and infrequently used, I have switched it to size_t to remove the possibility of a negative integer. If a contributor decides to call Repeat() or SerializeOpVec() with a negative number, though, they're going to have a bad time!
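The "bad time" comes from the implicit conversion: a negative int passed where a size_t is expected wraps around to a huge value. A small sketch of a guard at the call site follows; `ToCount` is a hypothetical helper, not something in OCIO, and throwing is just one possible policy.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <stdexcept>

// Hypothetical helper (not in OCIO): converts a signed value to size_t and
// rejects negatives instead of letting them wrap around.
inline std::size_t ToCount(long long value)
{
    if (value < 0)
    {
        throw std::invalid_argument("count must not be negative");
    }
    return static_cast<std::size_t>(value);
}
```

Without such a guard, `static_cast<std::size_t>(-1)` equals `std::numeric_limits<std::size_t>::max()`, so a stray -1 turns into a request for the largest possible repeat count.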
inline std::string Repeat(const std::string & str, size_t n)
{
// Early exit and match pystring::mul behaviour.
if (n == 0) { return {}; }
I've not benchmarked it, but under some circumstances we might want to also check || str.empty() to catch the case of appending an empty string many times, though the extra test probably slows down the function when n is small.
I agree that an early return on str.empty() makes sense. I ran a benchmark on both Multiply() and Repeat() with an empty string input and 10,000 repeats. Both Multiply() and Repeat() return the empty input unchanged, and Repeat() just barely beat out Multiply(). At this point, I feel pretty good about removing Multiply() in favour of Repeat() if you agree.
Multiply() Total Time: 0.129125 ms
Multiply() Avg Time: 1.29125e-05 ms/iter
Repeat() Total Time: 0.119125 ms
Repeat() Avg Time: 1.19125e-05 ms/iter
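The early exit under discussion can be sketched as below. `RepeatWithEmptyGuard` is a hypothetical name, and the body uses a plain append loop rather than the doubling strategy, purely to keep the example short; only the guard on the first line is the point.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical sketch: the guard handles both "repeat zero times" and
// "repeat an empty string", so the loop never runs for either case.
inline std::string RepeatWithEmptyGuard(const std::string & str, std::size_t n)
{
    if (n == 0 || str.empty()) { return {}; }

    std::string result;
    result.reserve(str.size() * n);
    for (std::size_t i = 0; i < n; ++i)
    {
        result += str;  // simplified body; the reviewed code doubles instead
    }
    return result;
}
```

The extra `str.empty()` test costs one comparison per call, which is the small-n trade-off mentioned in the review.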
First step of pystring removal by improving StringUtils (#2256)
StringUtils has been updated to include some pystring functionality. Focused on replacing:
These usages have also been replaced throughout the modules that use them. Only pystring::os usages should be left now.
Additional changes of note:
@KevinJW please take a look when you have a moment! Thank you.