Tags: apache/cloudberry
Apache Cloudberry (Incubating) 2.0.0-incubating-rc1 Release Candidate
Reimplement copy from directory table

In this commit, we reimplement copy from directory table by transferring the file between QD and QE block by block, instead of transferring the whole file at once as before. This avoids out-of-memory errors when calling palloc to allocate memory for the whole file.

Authored-by: Zhang Wenchao zwcpostgres@gmail.com
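The block-by-block idea described in this commit can be illustrated with a minimal Python sketch. This is not the Cloudberry implementation (which is C code moving data between QD and QE); the function names and the 64 KiB block size are hypothetical, chosen only to show that memory use is bounded by one block rather than by the file size.

```python
import io
from typing import BinaryIO, Iterator

# Hypothetical block size; the commit message does not state the real one.
BLOCK_SIZE = 64 * 1024


def iter_blocks(src: BinaryIO, block_size: int = BLOCK_SIZE) -> Iterator[bytes]:
    """Yield the stream as blocks of at most block_size bytes.

    Only one block is held in memory at a time, so peak memory does not
    grow with the size of the file.
    """
    while True:
        block = src.read(block_size)
        if not block:  # empty read means end of stream
            break
        yield block


def copy_in_blocks(src: BinaryIO, dst: BinaryIO, block_size: int = BLOCK_SIZE) -> int:
    """Forward src to dst block by block and return the number of bytes copied."""
    total = 0
    for block in iter_blocks(src, block_size):
        dst.write(block)
        total += len(block)
    return total
```

For example, copying a 150000-byte `io.BytesIO` source with the default block size takes three reads (two full blocks and one partial block), and the destination ends up byte-identical to the source.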
Reduce flakiness in test fts_segment_reset

Have seen some flakiness in test fts_segment_reset, because FTS sometimes still promotes the mirror if the primary takes a bit longer to restart after getting out of the RESET stage. An example:

- Primary 0 gets out of RESET and is about to be restarted:

  2022-05-23 15:32:53.924540 UTC,,,p105578,th1560833280,,,,0,,,seg0,,,,,"LOG","00000","all server processes terminated; reinitializing",,,,,,,0,,"postmaster.c",4284,

- It takes primary 0 about 2-3 seconds to do so:

  2022-05-23 15:32:56.184117 UTC,,,p105578,th1560833280,,,,0,,,seg0,,,,,"LOG","00000","database system is ready to accept connections"

- Unfortunately, before primary 0 can restart, FTS makes one last probe and finds that it is in recovery mode and not making progress (which is "correct" because primary 0 has finished recovery):

  2022-05-23 15:32:56.009206 UTC,,,p102591,th2023709952,,,,0,con3,,seg-1,,,,,"LOG","00000","FTS: detected segment is in recovery mode and not making progress (content=0) primary dbid=2, mirror dbid=5",,,,,,,0,,"ftsprobe.c",254,
  2022-05-23 15:32:56.065399 UTC,,,p102591,th2023709952,,,,0,con3,,seg-1,,,,,"LOG","00000","FTS max (5) retries exhausted (content=0, dbid=2) state=9",,,,,,,0,,"ftsprobe.c",788

Currently, we let the primary stay in the RESET stage for 27 seconds. FTS has a default retry cycle of 5 seconds, at the end of which it makes the promote decision. That leaves about 3 seconds for the primary to start after getting out of RESET, which is probably too short.

Now make the retry cycle 15 seconds and the RESET delay 17 seconds. That leaves about 13 seconds for the primary to start after getting out of RESET, which should be enough to reduce the common flakiness.
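To see how narrow the race was, the timestamps from the log lines quoted in this commit message can be compared directly. The sketch below only does arithmetic on those four timestamps; the variable names are illustrative and nothing here reproduces FTS logic.

```python
from datetime import datetime

# Timestamp layout used in the quoted log lines (the UTC suffix is dropped).
FMT = "%Y-%m-%d %H:%M:%S.%f"


def ts(text: str) -> datetime:
    """Parse one log timestamp."""
    return datetime.strptime(text, FMT)


# Values copied from the log lines in the commit message.
reinitializing = ts("2022-05-23 15:32:53.924540")   # primary leaves RESET
ready = ts("2022-05-23 15:32:56.184117")            # primary accepts connections
last_probe = ts("2022-05-23 15:32:56.009206")       # FTS sees "not making progress"
retries_exhausted = ts("2022-05-23 15:32:56.065399")  # FTS gives up on the primary

# How long the primary needed to restart after leaving RESET.
restart_seconds = (ready - reinitializing).total_seconds()

# How much earlier FTS gave up than the primary became ready.
missed_by_seconds = (ready - retries_exhausted).total_seconds()

print(f"restart took {restart_seconds:.3f} s")
print(f"FTS gave up {missed_by_seconds:.3f} s before the primary was ready")
```

With these values the restart took about 2.26 seconds and FTS exhausted its retries roughly 0.12 seconds before the primary was ready, which is why widening the margin from about 3 seconds to about 13 seconds is expected to help.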