Description
I'm submitting a...
[ ] Regression (a behavior that used to work and stopped working in a new release)
[x] Bug report
[ ] Performance issue
[ ] Feature request
[ ] Documentation issue or request
[ ] Support request => Please do not submit support request here, instead see
[ ] Other... Please describe:
Current behavior
When AeronArchive.listRecordings(0, 100, consumer) is called once per second while io.aeron.archive.ArchivingMediaDriver and io.aeron.samples.archive.RecordedBasicPublisher are running concurrently, each call takes longer than the previous one and eventually hits the 5-second timeout. The slowdown becomes obvious once the number of archives exceeds 8.
Expected behavior
listRecordings should consistently return the details of at least 100 archive recordings in less than a second while RecordedBasicPublisher is running concurrently.
Minimal reproduction of the problem with instructions
- git clone https://github.com/real-logic/aeron.git
- Add the file below to the following directory:
{CLONE DIRECTORY}/aeron-samples/src/main/java/io/aeron/samples
package io.aeron.samples;

import io.aeron.archive.client.AeronArchive;
import io.aeron.archive.client.RecordingDescriptorConsumer;
import org.agrona.concurrent.SigInt;

import java.util.concurrent.atomic.AtomicBoolean;

public class TestSample
{
    public static void printArchiveLogs(final AeronArchive archive)
    {
        // Prints one line per recording descriptor returned by the archive.
        final RecordingDescriptorConsumer consumer =
            (controlSessionId,
            correlationId,
            recordingId,
            startTimestamp,
            stopTimestamp,
            startPosition,
            stopPosition,
            initialTermId,
            segmentFileLength,
            termBufferLength,
            mtuLength,
            sessionId,
            streamId,
            strippedChannel,
            originalChannel,
            sourceIdentity) ->
            {
                System.out.format("[recordingId]: %d, " +
                    "[Timestamp]: [%d, %d], [startPosition]: %d, [stopPosition]: %d, [initialTermId]: %d, " +
                    "[segmentFileLength]: %d, [sessionId]: %d, " +
                    "[streamId]: %d, [originalChannel]: %s, " +
                    "[SourceIdentity]: %s\n",
                    recordingId, startTimestamp, stopTimestamp,
                    startPosition, stopPosition, initialTermId,
                    segmentFileLength, sessionId, streamId,
                    originalChannel, sourceIdentity);
            };

        // List up to 100 recordings starting from recording id 0; both values can be parameterized.
        final long fromRecordingId = 0L;
        final int recordCount = 100;
        final int foundCount = archive.listRecordings(fromRecordingId, recordCount, consumer);
        System.out.println("Number of recordings is: " + foundCount);
    }

    public static void main(final String[] args) throws Exception
    {
        final AtomicBoolean running = new AtomicBoolean(true);
        SigInt.register(() -> running.set(false));

        try (AeronArchive archive = AeronArchive.connect())
        {
            // Query the archive once per second until interrupted with Ctrl+C.
            while (running.get())
            {
                TestSample.printArchiveLogs(archive);
                Thread.sleep(1000);
            }
        }
        catch (final Exception e)
        {
            System.out.println("Something Went Wrong");
            System.out.println(e);
            return;
        }
    }
}
- Run ./gradlew
- In your aeron directory run
java -cp aeron-samples/build/libs/samples.jar io.aeron.archive.ArchivingMediaDriver
- In your aeron directory run
java -cp aeron-samples/build/libs/samples.jar io.aeron.samples.archive.RecordedBasicPublisher
- Stop RecordedBasicPublisher (step 5) and rerun it 8 times to get 8 archives. Each time, stop it after it has sent 15-20 messages. Leave the last instance running.
- In your aeron directory run
java -cp aeron-samples/build/libs/samples.jar -Daeron.spies.simulate.connection=true io.aeron.samples.TestSample
- The printed output should show each query getting progressively slower. Eventually the call hits the timeout, at about the 10-minute mark.
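To put numbers on the slowdown rather than judging it from the console, each query could be timed with something like the sketch below. This is not part of the reproduction above: `QueryTimer` and `timeCalls` are hypothetical names, and the `IntSupplier` is a stand-in for the `archive.listRecordings(fromRecordingId, recordCount, consumer)` call in TestSample so that the helper compiles without Aeron on the classpath.

```java
import java.util.function.IntSupplier;

public class QueryTimer
{
    /**
     * Runs the query the given number of times and returns each call's
     * elapsed time in nanoseconds. The query is a stand-in (assumption)
     * for a call such as archive.listRecordings(...), which returns an int.
     */
    public static long[] timeCalls(final IntSupplier query, final int iterations)
    {
        final long[] elapsedNs = new long[iterations];
        for (int i = 0; i < iterations; i++)
        {
            final long start = System.nanoTime();
            final int found = query.getAsInt();
            elapsedNs[i] = System.nanoTime() - start;
            System.out.println(
                "query " + i + ": found=" + found + ", elapsedMs=" + (elapsedNs[i] / 1_000_000));
        }
        return elapsedNs;
    }

    public static void main(final String[] args)
    {
        // Placeholder query that always reports 0 recordings; in TestSample this
        // would be () -> archive.listRecordings(0L, 100, consumer).
        timeCalls(() -> 0, 3);
    }
}
```

In TestSample the same measurement could wrap the existing call inside the one-second loop, so the elapsed time is printed next to the recording count on every iteration.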
Which versions of Aeron, OS, Java are affected?
Aeron - 1.11.2
Java version
java version "1.8.0_181"
Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)
Operating System
NAME="Ubuntu"
VERSION="18.04.1 LTS (Bionic Beaver)"