8000 "illegal hardware instruction" on macOS compiled nnn when invoking any plugin · Issue #786 · jarun/nnn · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

"illegal hardware instruction" on macOS compiled nnn when invoking any plugin #786

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
prurph opened this issue Nov 18, 2020 · 51 comments
Closed
Labels

Comments

@prurph
Copy link
Collaborator
prurph commented Nov 18, 2020

Environment details (Put x in the checkbox along with the information)

[x] Operating System: macOS Catalina 10.15.7
[ ] Desktop Environment:
[x] Terminal Emulator: iTerm2 (also occurs on Terminal)
[x] Shell: zsh 5.7.1 (also occurs on bash 5.0.18)
[ ] Custom desktop opener (if applicable):
[ ] Program options used:
[x] Configuration options set: NNN_PLUG='i:ipinfo' NNN_FIFO='/tmp/nnn.fifo'
[x] Issue exists on nnn master

Exact steps to reproduce the issue

  1. Compile nnn with or without flags.
  2. Install plugins to ~/.config/nnn/plugins
  3. Enable any plugin with NNN_PLUG.
  4. Start nnn
  5. Invoke any plugin
  6. nnn crashes, leaving the line: zsh: illegal hardware instruction ./nnn

This does not occur when installing nnn via homebrew. I initially discovered the issue when compiling with O_NERD=1 for the icons, but it occurs without any flags set as well.

Debug output:

$ cat /tmp/nnndbg
ln 7686: VERSION=3.5
ln 7714: home=/Users/p
ln 7445: cfgpath=/Users/p/.config
ln 7451: cfgpath=/Users/p/.config/nnn
ln 7477: selpath=/Users/p/.config/nnn/.selection
ln 7721: opener=/usr/bin/open
ln 7802: getenv(envs[ENV_VISUAL])=(null)
ln 7803: getenv(envs[ENV_EDITOR])=nvim
ln 7804: editor=nvim
ln 7808: pager=less
ln 7812: shell=/bin/zsh
ln 7814: getenv("PWD")=/Users/p/Code/nnn
ln 1712: COLORS=256
ln 1713: COLOR_PAIRS=32767
ln 4972: __FUNCTION__=dentfill
ln 5209: ts2.tv_nsec - ts1.tv_nsec=237000
ln 5712: __FUNCTION__=redraw
ln 5727: path=/Users/p/Code/nnn
@prurph prurph added the bug label Nov 18, 2020
@0xACE
Copy link
Collaborator
0xACE commented Nov 18, 2020

Don't have the time to read the issue atm, but this sounds awfully a lot like #766

OP of that post never gave us a full backtrace, could you maybe provide one so we can have a look at what's going on?

@prurph
Copy link
Collaborator Author
prurph commented Nov 18, 2020

Here's bt all from lldb.

* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0)
  * frame #0: 0x00007fff6bbbafca libsystem_c.dylib`__chk_fail_overflow.cold.1 + 16
    frame #1: 0x00007fff6bbb8214 libsystem_c.dylib`__chk_fail_overflow + 9
    frame #2: 0x00007fff6bbb843d libsystem_c.dylib`__memccpy_chk + 77
    frame #3: 0x0000000100011979 nnn`run_selected_plugin [inlined] xstrsncpy(dst="", src=<unavailable>, n=1024) at nnn.c:912:14 [opt]
    frame #4: 0x0000000100011956 nnn`run_selected_plugin [inlined] mkpath(dir=<unavailable>, name=<unavailable>, out=<unavailable>) at nnn.c:1032 [opt]
    frame #5: 0x0000000100011902 nnn`run_selected_plugin [inlined] plctrl_init at nnn.c:4716 [opt]
    frame #6: 0x00000001000118ce nnn`run_selected_plugin(path=0x00007ffeefbfdff0, file="ipinfo", runfile="misc", lastname=0x00007ffeefbfdfd0, lastdir=0x00007ffeefbfdfb0) at nnn.c:4807 [opt]
    frame #7: 0x000000010000a0ca nnn`browse(ipath="/Users/prescott/Code/nnn", session=<unavailable>, pkey=0) at nnn.c:6914:10 [opt]
    frame #8: 0x00000001000024fe nnn`main(argc=1, argv=0x00007ffeefbff4f0) at nnn.c:7922:8 [opt]
    frame #9: 0x00007fff6baebcc9 libdyld.dylib`start + 1

Here's frame 3:

CleanShot 2020-11-17 at 21 02 09

And bt full from gdb.

#0  0x00007fff6bbbafca in ?? ()
No symbol table info available.
#1  0x00007ffeefbfde00 in ?? ()
No symbol table info available.
#2  0x00007fff6bbb8214 in ?? ()
No symbol table info available.
#3  0x00007ffeefbfde30 in ?? ()
No symbol table info available.
#4  0x00007fff6bbb843d in ?? ()
No symbol table info available.
#5  0x00007ffeefbfe050 in ?? ()
No symbol table info available.
#6  0x00007ffeefbfe070 in ?? ()
No symbol table info available.
#7  0x000000010001f1b0 in ihashbmp ()
No symbol table info available.
#8  0x0000000104018814 in ?? ()
No symbol table info available.
#9  0x00007ffeefbfde90 in ?? ()
No symbol table info available.
#10 0x0000000100011979 in xstrsncpy (dst=0x10001f1b0 <g_pipepath> "", src=0x1f08000c <error: Cannot access memory at address 0x1f08000c>, n=1024) at src/nnn.c:912
        end = <optimized out>
#11 mkpath (dir=<optimized out>, name=<optimized out>, out=<optimized out>) at src/nnn.c:1032
        len = <optimized out>
#12 plctrl_init () at src/nnn.c:4716
No locals.
#13 run_selected_plugin (path=0x7ffeefbfe070, file=0x1006041c2 "ipinfo", runfile=0x104018814 "misc", lastname=0x7ffeefbfe050, lastdir=0x100000002) at src/nnn.c:4807
        flags = 0 '\000'
        cmd_as_plugin = false
        rfd = <optimized out>
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Happy to help as much as possible but as I'm sure you can tell I'm way out of my depth here!

@prurph
Copy link
Collaborator Author
prurph commented Nov 18, 2020

Okay interesting development!

I noticed lldb said "compiled with optimization - stepping may behave oddly ..." but I just did make debug which appears to run with -O3. So I tried recompiling with different levels of optimizations and with -O2 the same issue occurs, but with -O1 the plugin invocation works correctly!

@jarun
Copy link
Owner
jarun commented Nov 18, 2020

I think the tmp path length is the problem. Can you please apply the following patch, compile in debug mode and share the logs:

diff --git a/src/nnn.c b/src/nnn.c
index 7b108b6..247e572 100644
--- a/src/nnn.c
+++ b/src/nnn.c
@@ -7492,6 +7492,8 @@ static bool set_tmp_path(void)
         }
 
         tmpfplen = (uchar)xstrsncpy(g_tmpfpath, path, TMP_LEN_MAX);
+       DPRINTF_S(g_tmpfpath);
+       DPRINTF_U(tmpfplen);
         return TRUE;
 }
 

@prurph
Copy link
Collaborator Author
prurph commented Nov 18, 2020

I think you're right. Here's the output:

ln 7688: VERSION=3.5
ln 7495: g_tmpfpat
8000
h=/tmp
ln 7496: tmpfplen=5
ln 7716: home=/Users/prescott
ln 7445: cfgpath=/Users/prescott/.config
ln 7451: cfgpath=/Users/prescott/.config/nnn
ln 7477: selpath=/Users/prescott/.config/nnn/.selection
ln 7723: opener=/usr/bin/open
ln 7804: getenv(envs[ENV_VISUAL])=(null)
ln 7805: getenv(envs[ENV_EDITOR])=nvim
ln 7806: editor=nvim
ln 7810: pager=less
ln 7814: shell=/bin/zsh
ln 7816: getenv("PWD")=/Users/prescott/Code/nnn
ln 1712: COLORS=256
ln 1713: COLOR_PAIRS=32767
ln 4972: __FUNCTION__=dentfill
ln 5209: ts2.tv_nsec - ts1.tv_nsec=118000
ln 5712: __FUNCTION__=redraw
ln 5727: path=/Users/prescott/Code/nnn
ln 6122: newpath=/Users/prescott/Code/nnn/misc
ln 5819: path=/Users/prescott/Code/nnn/misc
ln 4972: __FUNCTION__=dentfill
ln 5209: ts2.tv_nsec - ts1.tv_nsec=125000
ln 5712: __FUNCTION__=redraw
ln 5727: path=/Users/prescott/Code/nnn/misc

@jarun
Copy link
Owner
jarun commented Nov 18, 2020

But the following are the same on my system too!

g_tmpfpath=/tmp
tmpfplen=5

and I don't see this.

Please try this:

diff --git a/src/nnn.c b/src/nnn.c
index 7b108b6..8d25138 100644
--- a/src/nnn.c
+++ b/src/nnn.c
@@ -907,7 +907,7 @@ static void *xrealloc(void *pcur, size_t len)
  * Always null ('\0') terminates if both src and dest are valid pointers.
  * Returns the number of bytes copied including terminating null byte.
  */
-static size_t xstrsncpy(char *restrict dst, const char *restrict src, size_t n)
+static size_t xstrsncpy(char *dst, const char *src, size_t n)
 {
        char *end = memccpy(dst, src, '\0', n);
 

@prurph
Copy link
Collaborator Author
prurph commented Nov 18, 2020

The issue still occurs with the above change and the debug log output is the same. One thing I just noticed is that /tmp is a symlink:

$ ls -lah /tmp
lrwxr-xr-x 1 root admin 11 Jul  9 19:57 /tmp -> private/tmp

could this be relevant?

@jarun
Copy link
Owner
jarun commented Nov 18, 2020

I'll try to repro that using a symlink. For now, I'll have to retire. Thanks for all the details. We'll get back if we need anything else.

@prurph
Copy link
Collaborator Author
prurph commented Nov 18, 2020

Great, thank you! Compiling with a lower optimization level as a workaround means I'm able to use plugins and enjoy the awesome icons. 😎

@jarun
Copy link
Owner
jarun commented Nov 18, 2020

You can try removing both the const and restrict qualifiers and give it a shot. It's really embarrassing not to have the HW and request for details repeatedly. :(

@0xACE
Copy link
Collaborator
0xACE commented Nov 18, 2020

I went so sleep after my message. I want to say thank you for providing the backtrace <3 it is very helpful.

I'm not familiar with this section of code in particular.

I peeked at the relevant code and it's using a various "shared" variables that I'm also not familiar with...

Therefore I feel it would save a lot of time to test it on the hardware rather than just looking at the code... Either you can hit us up on https://gitter.im/jarun/nnn and we can remotely debug it with tmate. Or I harass my friends for access to a mac ;)

@jarun
Copy link
Owner
jarun commented Nov 18, 2020

I tried the symlink variant now and didn't see the problem. As your tests indicate the compiler is probably handling src specially. I see we can still access g_tmpfpath at line 4715 and probably using a temporary variable may fix this but I would like to understand what is going on for the greater good of the program.

@zmwangx can you please take a look?

@zmwangx
Copy link
Contributor
zmwangx commented Nov 18, 2020

Can repro with -O3 and -O2. -O1 works fine. As does -Os -march=native, which is what I would personally choose. Homebrew strips -O3 and replaces it with something to that effect, except with an explicit arch, e.g. nehalam. The cc shim that does that can be found at /usr/local/Homebrew/Library/Homebrew/shims/mac/super/cc (shims for other compilers and stuff can be found in the same folder). That's why the Homebrew build works.

Aside: You can find the actual compiler invocation at ~/Library/Logs/Homebrew/nnn/01.make.cc if you install from source with brew: brew install -s nnn. It would look like

clang -pipe -w -Os -march=nehalem -std=c11 -o nnn src/nnn.c -lreadline -lncurses -isysroot/Library/Developer/CommandLineTools/SDKs/MacOSX11.0.sdk --sysroot=/Library/Developer/CommandLineTools/SDKs/MacOSX11.0.sdk -isystem/usr/local/include -isystem/Library/Developer/CommandLineTools/SDKs/MacOSX11.0.sdk/System/Library/Frameworks/OpenGL.framework/Versions/Current/Headers -I/usr/local/opt/readline/include -L/usr/local/opt/readline/lib -L/usr/local/lib -L/Library/Developer/CommandLineTools/SDKs/MacOSX11.0.sdk/System/Library/Frameworks/OpenGL.framework/Versions/Current/Libraries -Wl,-headerpad_max_install_names

End of aside.

Not sure why even -O2 generates wrong code. Let me know if you want to chase this further, but I probably don't have time for that until the weekend.

@zmwangx
Copy link
Contributor
zmwangx commented Nov 18, 2020

(I only read the top of the thread so I might have been repeating some known things and ignoring some later developments. If there's something I should know, please kindly point me to it.)

@jarun
Copy link
Owner
jarun commented Nov 18, 2020

Not sure why even -O2 generates wrong code.

This is something I want to pursue. Seems something specific to the code generated on macOS. Interesting.

I can surely wait till the weekend for this. :)

@zmwangx
Copy link
Contributor
zmwangx commented Nov 18, 2020

Sure. I'll also read up on what you've found out thus far later.

Btw, if you have the time and are feeling adventurous, I heard there's a trick you can use to SSH into GitHub Actions' macOS instances for some live debugging. Something like this: https://dev.to/retyui/how-debugging-github-actions-with-ssh-273n

@jarun
Copy link
Owner
jarun commented Nov 18, 2020

I tried that already as part of #766 investigation. Didn't see the issue.

@jarun
Copy link
Owner
jarun commented Nov 18, 2020

@prurph in the meantime, please check if removing both the const and restrict qualifiers fixes it. If it works, try removing only const.

@zmwangx
Copy link
Contributor
zmwangx commented Nov 18, 2020

Interesting, I wonder what’s different about MacStadium’s Mac Minis. Or maybe it’s something about the hypervisor.

@jarun
Copy link
Owner
jarun commented Nov 18, 2020

Also adding @saagarjha for some possible insight into the problem.

@prurph
Copy link
Collaborator Author
prurph commented Nov 18, 2020

@jarun removing const and restrict does not fix the issue. (It does generate a lot of compiler warnings about discarding qualifiers for invocations of the function.)

@jarun
Copy link
Owner
jarun commented Nov 18, 2020

OK. So you mean the program didn't even compile for you to test it, right?

@prurph
Copy link
Collaborator Author
prurph commented Nov 18, 2020

No, I'm sorry I should have been clearer. It compiled with warnings, but the problem still occurred when plugins were invoked.

All of the warnings were incompatible-pointer-types-discards-qualifiers which I'm assuming is fine since we're discarding the qualifiers in the revised xstrsncpy:

src/nnn.c:1026:25: warning: passing 'const char *' to parameter of type 'char *' discards qualifiers [-Wincompatible-pointer-types-discards-qualifiers]
                return xstrsncpy(out, name, PATH_MAX);

@jarun
Copy link
Owner
jarun commented Nov 18, 2020

OK. Are you able to see the help screen (?) and any file stat (f)? Those paths use g_tmpfpath too.

@prurph
Copy link
Collaborator Author
prurph commented Nov 18, 2020

Interestingly, both the help and file stat screens work correctly.

@jarun
Copy link
Owner
jarun commented Nov 18, 2020

OK. Can you try nnn -a?

@prurph
Copy link
Collaborator Author
prurph commented Nov 18, 2020

Same result with nnn -a: help/filestat work, plugins crash.

@jarun
Copy link
Owner
jarun commented Nov 18, 2020

OK, my understanding is - nnn -a works.

Update: ignore this. I understand what you meant.

@jarun
Copy link
Owner
jarun commented Nov 18, 2020

What is the output of the following?

echo $TMPDIR

@jarun
Copy link
Owner
jarun commented Nov 18, 2020

Please try the following patch:

diff --git a/src/nnn.c b/src/nnn.c
index 7b108b6..5a6e91b 100644
--- a/src/nnn.c
+++ b/src/nnn.c
@@ -4710,10 +4710,10 @@ static bool run_cmd_as_plugin(const char *file, char *runfile, uchar flags)
 
 static bool plctrl_init(void)
 {
-       snprintf(g_buf, CMD_LEN_MAX, "nnn-pipe.%d", getpid());
        /* g_tmpfpath is used to generate tmp file names */
        g_tmpfpath[tmpfplen - 1] = '\0';
-       mkpath(g_tmpfpath, g_buf, g_pipepath);
+       size_t r = mkpath(g_tmpfpath, "nnn-pipe.", g_pipepath);
+       xstrsncpy(g_pipepath + r - 1, xitoa(getpid()), TMP_LEN_MAX - r);
        setenv(env_cfg[NNN_PIPE], g_pipepath, TRUE);
 
        return EXIT_SUCCESS;

This follows the same logic that nnn -a uses now which you confirmed works (means nnn didn't crash on startup). Let's see if it works for plctrl_init() as well.

@jarun
Copy link
Owner
jarun commented Nov 18, 2020

If this doesn't work, I believe I do have a 100% way of making this work. But let's give the above patch a try first.

@prurph
Copy link
Collaborator Author
prurph commented Nov 18, 2020

Unfortunately, the above also doesn't work.

nnn has never crashed on startup originally or with any of the patches--it crashes when any plugin is invoked.

@jarun
Copy link
Owner
jarun commented Nov 18, 2020

OK. I think now we need to try what i hinted at earlier.

@jarun
Copy link
Owner
jarun commented Nov 18, 2020

Please try the following patch:

diff --git a/src/nnn.c b/src/nnn.c
index 8328718..8ada5ee 100644
--- a/src/nnn.c
+++ b/src/nnn.c
@@ -312,7 +312,6 @@ typedef struct {
 
 /* Non-persistent program-internal states */
 typedef struct {
-	uint pluginit   : 1;  /* Plugin framework initialized */
 	uint interrupt  : 1;  /* Program received an interrupt */
 	uint rangesel   : 1;  /* Range selection on */
 	uint move       : 1;  /* Move operation */
@@ -332,7 +331,7 @@ typedef struct {
 	uint stayonsel	: 1;  /* Disable auto-proceed on select */
 	uint dirctx     : 1;  /* Show dirs in context color */
 	uint uidgid     : 1;  /* Show owner and group info */
-	uint reserved   : 10; /* Adjust when adding/removing a field */
+	uint reserved   : 11; /* Adjust when adding/removing a field */
 } runstate;
 
 /* Contexts or workspaces */
@@ -4708,15 +4707,13 @@ static bool run_cmd_as_plugin(const char *file, char *runfile, uchar flags)
 	return TRUE;
 }
 
-static bool plctrl_init(void)
+static void plctrl_init(void)
 {
-	snprintf(g_buf, CMD_LEN_MAX, "nnn-pipe.%d", getpid());
 	/* g_tmpfpath is used to generate tmp file names */
 	g_tmpfpath[tmpfplen - 1] = '\0';
-	mkpath(g_tmpfpath, g_buf, g_pipepath);
+	size_t r = mkpath(g_tmpfpath, "nnn-pipe.", g_pipepath);
+	xstrsncpy(g_pipepath + r - 1, xitoa(getpid()), TMP_LEN_MAX - r);
 	setenv(env_cfg[NNN_PIPE], g_pipepath, TRUE);
-
-	return EXIT_SUCCESS;
 }
 
 static void rmlistpath()
@@ -4803,11 +4800,6 @@ static bool run_selected_plugin(char **path, const char *file, char *runfile, ch
 	bool cmd_as_plugin = FALSE;
 	uchar flags = 0;
 
-	if (!g_state.pluginit) {
-		plctrl_init();
-		g_state.pluginit = 1;
-	}
-
 	if (*file == '_') {
 		flags = F_MULTI | F_CONFIRM;
 
@@ -7511,8 +7503,7 @@ static void cleanup(void)
 	if (g_state.autofifo)
 		unlink(fifopath);
 #endif
-	if (g_state.pluginit)
-		unlink(g_pipepath);
+	unlink(g_pipepath);
 #ifdef DBGMODE
 	disabledbg();
 #endif
@@ -7813,11 +7804,11 @@ int main(int argc, char *argv[])
 
 	DPRINTF_S(getenv("PWD"));
 
+	plctrl_init();
+
 #ifndef NOFIFO
 	/* Create fifo */
 	if (g_state.autofifo) {
-		g_tmpfpath[tmpfplen - 1] = '\0';
-
 		size_t r = mkpath(g_tmpfpath, "nnn-fifo.", g_buf);
 
 		xstrsncpy(g_buf + r - 1, xitoa(getpid()), PATH_MAX - r);

@prurph
Copy link
Collaborator Author
prurph commented Nov 18, 2020

That exact patch yields a compilation error:

cc -DDBGMODE -g -std=c11 -Wall -Wextra -Wshadow -O3  -L/usr/local/opt/libffi/lib -o nnn src/nnn.c -lreadline -lncurses
src/nnn.c:7316:15: error: no member named 'pluginit' in 'runstate'
                if (g_state.pluginit) {
                    ~~~~~~~ ^
1 error generated.
make: *** [nnn] Error 1

I tried this:

@@ -7321,11 +7313,7 @@ malloc_2:
                free(paths[i]);
 malloc_1:
        if (msgnum) {
-               if (g_state.pluginit) {
-                       printmsg(messages[msgnum]);
-                       xdelay(XDELAY_INTERVAL_MS);
-               } else
-                       fprintf(stderr, "%s\n", messages[msgnum]);
+               fprintf(stderr, "%s\n", messages[msgnum]);
        }
        free(input);
        free(paths);

and it compiled, but nnn crashed immediately on startup this time with the following backtrace:

* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0)
  * frame #0: 0x00007fff6bbbafca libsystem_c.dylib`__chk_fail_overflow.cold.1 + 16
    frame #1: 0x00007fff6bbb8214 libsystem_c.dylib`__chk_fail_overflow + 9
    frame #2: 0x00007fff6bbb843d libsystem_c.dylib`__memccpy_chk + 77
    frame #3: 0x0000000100001c4b nnn`main [inlined] xstrsncpy(dst="", src=<unavailable>, n=1024) at nnn.c:911:14 [opt]
    frame #4: 0x0000000100001c28 nnn`main [inlined] mkpath(dir=<unavailable>, name=<unavailable>, out=<unavailable>) at nnn.c:1031 [opt]
    frame #5: 0x0000000100001c10 nnn`main [inlined] plctrl_init at nnn.c:4714 [opt]
    frame #6: 0x0000000100001bfd nnn`main(argc=1, argv=0x00007ffeefbff4e0) at nnn.c:7805 [opt]
    frame #7: 0x00007fff6baebcc9 libdyld.dylib`start + 1

@jarun
Copy link
Owner
jarun commented Nov 18, 2020

My previous commit removes the flag so if you update to master the it would work.

The crash is strange! You mentioned nnn -a didn't crash at startup. The code is almost the same as the block within (g_state.autofifo) { below. Let me move the plugin init out of the static function.

@jarun
Copy link
Owner
jarun commented Nov 18, 2020

Wait, there's some issue...

    frame #3: 0x0000000100001c4b nnn`main [inlined] xstrsncpy(dst="", src=<unavailable>, n=1024) at nnn.c:911:14 [opt]
    frame #4: 0x0000000100001c28 nnn`main [inlined] mkpath(dir=<unavailable>, name=<unavailable>, out=<unavailable>) at nnn.c:1031 [opt]
    frame #5: 0x0000000100001c10 nnn`main [inlined] plctrl_init at nnn.c:4714 [opt]

plctrl_init() didn't have any mkpath() call in the patch. Did you apply the patch correctly?

@jarun
Copy link
Owner
jarun commented Nov 18, 2020

Sorry, I guess I am sleepy. Ignore the above comment. Let's try the next version.

@jarun
Copy link
Owner
jarun commented Nov 18, 2020

Please upgrade to master and try the following patch:

diff --git a/src/nnn.c b/src/nnn.c
index 8328718..f129ca3 100644
--- a/src/nnn.c
+++ b/src/nnn.c
@@ -312,7 +312,6 @@ typedef struct {
 
 /* Non-persistent program-internal states */
 typedef struct {
-	uint pluginit   : 1;  /* Plugin framework initialized */
 	uint interrupt  : 1;  /* Program received an interrupt */
 	uint rangesel   : 1;  /* Range selection on */
 	uint move       : 1;  /* Move operation */
@@ -332,7 +331,7 @@ typedef struct {
 	uint stayonsel	: 1;  /* Disable auto-proceed on select */
 	uint dirctx     : 1;  /* Show dirs in context color */
 	uint uidgid     : 1;  /* Show owner and group info */
-	uint reserved   : 10; /* Adjust when adding/removing a field */
+	uint reserved   : 11; /* Adjust when adding/removing a field */
 } runstate;
 
 /* Contexts or workspaces */
@@ -441,13 +440,13 @@ static struct sigaction oldsighup;
 static struct sigaction oldsigtstp;
 
 /* For use in functions which are isolated and don't return the buffer */
-static char g_buf[CMD_LEN_MAX] __attribute__ ((aligned));
+static char g_buf[CMD_LEN_MAX];
 
 /* Buffer to store tmp file path to show selection, file stats and help */
-static char g_tmpfpath[TMP_LEN_MAX] __attribute__ ((aligned));
+static char g_tmpfpath[TMP_LEN_MAX];
 
 /* Buffer to store plugins control pipe location */
-static char g_pipepath[TMP_LEN_MAX] __attribute__ ((aligned));
+static char g_pipepath[TMP_LEN_MAX];
 
 /* Non-persistent runtime states */
 static runstate g_state;
@@ -4708,17 +4707,6 @@ static bool run_cmd_as_plugin(const char *file, char *runfile, uchar flags)
 	return TRUE;
 }
 
-static bool plctrl_init(void)
-{
-	snprintf(g_buf, CMD_LEN_MAX, "nnn-pipe.%d", getpid());
-	/* g_tmpfpath is used to generate tmp file names */
-	g_tmpfpath[tmpfplen - 1] = '\0';
-	mkpath(g_tmpfpath, g_buf, g_pipepath);
-	setenv(env_cfg[NNN_PIPE], g_pipepath, TRUE);
-
-	return EXIT_SUCCESS;
-}
-
 static void rmlistpath()
 {
 	if (listpath) {
@@ -4803,11 +4791,6 @@ static bool run_selected_plugin(char **path, const char *file, char *runfile, ch
 	bool cmd_as_plugin = FALSE;
 	uchar flags = 0;
 
-	if (!g_state.pluginit) {
-		plctrl_init();
-		g_state.pluginit = 1;
-	}
-
 	if (*file == '_') {
 		flags = F_MULTI | F_CONFIRM;
 
@@ -7511,8 +7494,7 @@ static void cleanup(void)
 	if (g_state.autofifo)
 		unlink(fifopath);
 #endif
-	if (g_state.pluginit)
-		unlink(g_pipepath);
+	unlink(g_pipepath);
 #ifdef DBGMODE
 	disabledbg();
 #endif
@@ -7536,6 +7518,7 @@ int main(int argc, char *argv[])
 #ifndef NORL
 	bool rlhist = FALSE;
 #endif
+	size_t r;
 
 	while ((opt = (env_opts_id > 0
 		       ? env_opts[--env_opts_id]
@@ -7813,14 +7796,19 @@ int main(int argc, char *argv[])
 
 	DPRINTF_S(getenv("PWD"));
 
+	/* g_tmpfpath is used to generate tmp file names */
+	fd = getpid();
+	g_tmpfpath[tmpfplen - 1] = '\0';
+	r = mkpath(g_tmpfpath, "nnn-pipe.", g_pipepath);
+	xstrsncpy(g_pipepath + r - 1, xitoa(fd), TMP_LEN_MAX - r);
+	setenv(env_cfg[NNN_PIPE], g_pipepath, TRUE);
+
 #ifndef NOFIFO
 	/* Create fifo */
 	if (g_state.autofifo) {
-		g_tmpfpath[tmpfplen - 1] = '\0';
-
-		size_t r = mkpath(g_tmpfpath, "nnn-fifo.", g_buf);
+		r = mkpath(g_tmpfpath, "nnn-fifo.", g_buf);
 
-		xstrsncpy(g_buf + r - 1, xitoa(getpid()), PATH_MAX - r);
+		xstrsncpy(g_buf + r - 1, xitoa(fd), PATH_MAX - r);
 		setenv("NNN_FIFO", g_buf, TRUE);
 	}

@prurph
Copy link
Collaborator Author
prurph commented Nov 18, 2020

Crashes on startup:

* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0)
  * frame #0: 0x00007fff6bbbafca libsystem_c.dylib`__chk_fail_overflow.cold.1 + 16
    frame #1: 0x00007fff6bbb8214 libsystem_c.dylib`__chk_fail_overflow + 9
    frame #2: 0x00007fff6bbb843d libsystem_c.dylib`__memccpy_chk + 77
    frame #3: 0x0000000100001bb4 nnn`main [inlined] xstrsncpy(dst="", src=<unavailable>, n=1024) at nnn.c:911:14 [opt]
    frame #4: 0x0000000100001b91 nnn`main [inlined] mkpath(dir=<unavailable>, name=<unavailable>, out=<unavailable>) at nnn.c:1031 [opt]
    frame #5: 0x0000000100001b79 nnn`main(argc=1, argv=0x00007ffeefbff4d0) at nnn.c:7802 [opt]
    frame #6: 0x00007fff6baebcc9 libdyld.dylib`start + 1

@saagarjha
Copy link
Contributor

This looks like a buffer overflow. Try turning in Address Sanitizer to see where it is.

@prurph
Copy link
Collaborator Author
prurph commented Nov 18, 2020

When I turn on address sanitizer nnn starts properly AND the original issue disappears: I'm able to invoke plugins and they work instead of crashing nnn.

Just to be sure I'm doing this right, this is the invocation I used: (exactly what make debug does but with fsanitize=address added)

gcc -DDBGMODE -g -std=c11 -Wall -Wextra -Wshadow -O3 -fsanitize=address  -L/usr/local/opt/libffi/lib -o nnn src/nnn.c -lreadline -lncurses

so it seems like whatever changes the address sanitizer makes also "fix" things.

@jarun
Copy link
Owner
jarun commented Nov 18, 2020

Not sure if this is because of the length of the globals. I can try to allocate the 64 bye ones dynamically and see how it goes.

@jarun
Copy link
Owner
jarun commented Nov 18, 2020

Please try this version (last one):

diff --git a/src/nnn.c b/src/nnn.c
index 8328718..7311620 100644
--- a/src/nnn.c
+++ b/src/nnn.c
@@ -312,7 +312,6 @@ typedef struct {
 
 /* Non-persistent program-internal states */
 typedef struct {
-	uint pluginit   : 1;  /* Plugin framework initialized */
 	uint interrupt  : 1;  /* Program received an interrupt */
 	uint rangesel   : 1;  /* Range selection on */
 	uint move       : 1;  /* Move operation */
@@ -332,7 +331,7 @@ typedef struct {
 	uint stayonsel	: 1;  /* Disable auto-proceed on select */
 	uint dirctx     : 1;  /* Show dirs in context color */
 	uint uidgid     : 1;  /* Show owner and group info */
-	uint reserved   : 10; /* Adjust when adding/removing a field */
+	uint reserved   : 11; /* Adjust when adding/removing a field */
 } runstate;
 
 /* Contexts or workspaces */
@@ -415,6 +414,9 @@ static char *listroot;
 static char *plgpath;
 static char *pnamebuf, *pselbuf;
 static char *mark;
+static char *g_tmpfpath;
+static char *g_pipepath;
+
 #ifndef NOFIFO
 static char *fifopath;
 #endif
@@ -441,13 +443,7 @@ static struct sigaction oldsighup;
 static struct sigaction oldsigtstp;
 
 /* For use in functions which are isolated and don't return the buffer */
-static char g_buf[CMD_LEN_MAX] __attribute__ ((aligned));
-
-/* Buffer to store tmp file path to show selection, file stats and help */
-static char g_tmpfpath[TMP_LEN_MAX] __attribute__ ((aligned));
-
-/* Buffer to store plugins control pipe location */
-static char g_pipepath[TMP_LEN_MAX] __attribute__ ((aligned));
+static char g_buf[CMD_LEN_MAX];
 
 /* Non-persistent runtime states */
 static runstate g_state;
@@ -4708,17 +4704,6 @@ static bool run_cmd_as_plugin(const char *file, char *runfile, uchar flags)
 	return TRUE;
 }
 
-static bool plctrl_init(void)
-{
-	snprintf(g_buf, CMD_LEN_MAX, "nnn-pipe.%d", getpid());
-	/* g_tmpfpath is used to generate tmp file names */
-	g_tmpfpath[tmpfplen - 1] = '\0';
-	mkpath(g_tmpfpath, g_buf, g_pipepath);
-	setenv(env_cfg[NNN_PIPE], g_pipepath, TRUE);
-
-	return EXIT_SUCCESS;
-}
-
 static void rmlistpath()
 {
 	if (listpath) {
@@ -4803,11 +4788,6 @@ static bool run_selected_plugin(char **path, const char *file, char *runfile, ch
 	bool cmd_as_plugin = FALSE;
 	uchar flags = 0;
 
-	if (!g_state.pluginit) {
-		plctrl_init();
-		g_state.pluginit = 1;
-	}
-
 	if (*file == '_') {
 		flags = F_MULTI | F_CONFIRM;
 
@@ -7491,6 +7471,9 @@ static bool set_tmp_path(void)
                 return FALSE;
         }
 
+	g_tmpfpath = (char *)calloc(1, TMP_LEN_MAX);
+	if (!g_tmpfpath)
+		return FALSE;
         tmpfplen = (uchar)xstrsncpy(g_tmpfpath, path, TMP_LEN_MAX);
         return TRUE;
 }
@@ -7511,8 +7494,9 @@ static void cleanup(void)
 	if (g_state.autofifo)
 		unlink(fifopath);
 #endif
-	if (g_state.pluginit)
-		unlink(g_pipepath);
+	unlink(g_pipepath);
+	free(g_pipepath);
+	free(g_tmpfpath);
 #ifdef DBGMODE
 	disabledbg();
 #endif
@@ -7536,6 +7520,7 @@ int main(int argc, char *argv[])
 #ifndef NORL
 	bool rlhist = FALSE;
 #endif
+	size_t r;
 
 	while ((opt = (env_opts_id > 0
 		       ? env_opts[--env_opts_id]
@@ -7813,14 +7798,22 @@ int main(int argc, char *argv[])
 
 	DPRINTF_S(getenv("PWD"));
 
+	/* g_tmpfpath is used to generate tmp file names */
+	fd = getpid();
+	g_tmpfpath[tmpfplen - 1] = '\0';
+	g_pipepath = (char *)calloc(1, TMP_LEN_MAX);
+	if (!g_pipepath)
+		return EXIT_FAILURE;
+	r = mkpath(g_tmpfpath, "nnn-pipe.", g_pipepath);
+	xstrsncpy(g_pipepath + r - 1, xitoa(fd), TMP_LEN_MAX - r);
+	setenv(env_cfg[NNN_PIPE], g_pipepath, TRUE);
+
 #ifndef NOFIFO
 	/* Create fifo */
 	if (g_state.autofifo) {
-		g_tmpfpath[tmpfplen - 1] = '\0';
-
-		size_t r = mkpath(g_tmpfpath, "nnn-fifo.", g_buf);
+		r = mkpath(g_tmpfpath, "nnn-fifo.", g_buf);
 
-		xstrsncpy(g_buf + r - 1, xitoa(getpid()), PATH_MAX - r);
+		xstrsncpy(g_buf + r - 1, xitoa(fd), PATH_MAX - r);
 		setenv("NNN_FIFO", g_buf, TRUE);
 	}

If this doesn't work, we'll have to have someone work on it on his system.

@prurph
Copy link
Collaborator Author
prurph commented Nov 18, 2020

The above patch causes an immediate crash with a similar backtrace:

* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0)
 * frame #0: 0x00007fff6bbbafca libsystem_c.dylib`__chk_fail_overflow.cold.1 + 16
   frame #1: 0x00007fff6bbb8214 libsystem_c.dylib`__chk_fail_overflow + 9
   frame #2: 0x00007fff6bbb843d libsystem_c.dylib`__memccpy_chk + 77
   frame #3: 0x0000000100001b5d nnn`main [inlined] xstrsncpy(dst="", src="/tmp", n=1024) at nnn.c:907:14 [opt]
   frame #4: 0x0000000100001b45 nnn`main [inlined] mkpath(dir="/tmp", name=<unavailable>, out="") at nnn.c:1027 [opt]
   frame #5: 0x0000000100001b30 nnn`main(argc=1, argv=0x00007ffeefbff4d0) at nnn.c:7807 [opt]
   frame #6: 0x00007fff6baebcc9 libdyld.dylib`start +

In the interim, might be worth adding a note to the wiki that mac users might want to compile with the settings in the above comment: #786 (comment) if they run into problems.

@jarun
Copy link
Owner
jarun commented Nov 19, 2020

I think I understood the problem. The buffers are fine. It's the length 1024. As both the buffers are only 64 bytes they probably start close (maybe from the same page too) enough to be detected as overlapping.

@jarun
Copy link
Owner
jarun commented Nov 19, 2020

The following patch should work for you:

iff --git a/src/nnn.c b/src/nnn.c
index 8328718..4b1bae5 100644
--- a/src/nnn.c
+++ b/src/nnn.c
@@ -1016,6 +1016,8 @@ static inline bool getutil(char *util)
 /*
  * Updates out with "dir/name or "/name"
  * Returns the number of bytes copied including the terminating NULL byte
+ *
+ * Note: dir and out must be PATH_MAX in length to avoid macOS fault
  */
 static size_t mkpath(const char *dir, const char *name, char *out)
 {
@@ -4710,12 +4712,18 @@ static bool run_cmd_as_plugin(const char *file, char *runfile, uchar flags)
 
 static bool plctrl_init(void)
 {
-       snprintf(g_buf, CMD_LEN_MAX, "nnn-pipe.%d", getpid());
+       size_t len;
+
        /* g_tmpfpath is used to generate tmp file names */
        g_tmpfpath[tmpfplen - 1] = '\0';
-       mkpath(g_tmpfpath, g_buf, g_pipepath);
+       len = xstrsncpy(g_pipepath, g_tmpfpath, TMP_LEN_MAX);
+       g_pipepath[len - 1] = '/';
+       len = xstrsncpy(g_pipepath + len, "nnn-pipe.", TMP_LEN_MAX - len) + len;
+       xstrsncpy(g_pipepath + len - 1, xitoa(getpid()), TMP_LEN_MAX - len);
        setenv(env_cfg[NNN_PIPE], g_pipepath, TRUE);
 
+       DPRINTF_S(g_pipepath);
+
        return EXIT_SUCCESS;
 }

@jarun
Copy link
Owner
jarun commented Nov 19, 2020

For later reference, found a similar defect here.

@jarun jarun closed this as completed in a443a32 Nov 19, 2020
@jarun jarun mentioned this issue Nov 19, 2020
21 tasks
@prurph
Copy link
Collaborator Author
prurph commented Nov 19, 2020

The issue is fixed in master now, thank you!

@jarun
Copy link
Owner
jarun commented Nov 19, 2020

No, thank you so much for reporting the issue!

@jarun
Copy link
Owner
jarun commented Nov 19, 2020

And running the tests.

@github-actions github-actions bot locked and limited conversation to collaborators Dec 20, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

5 participants
0