[Time] Name | Message |
[02:25] andrewvc
|
chuck
|
[02:25] andrewvc
|
err
|
[02:25] andrewvc
|
cremes: around?
|
[02:25] andrewvc
|
I think I found the bug, it's in ffi-rzmq I believe
|
[06:41] yrashk
|
under certain circumstances I am getting some weird segfault in zmq_recv (I use it in an Erlang NIF, and it works fine until I create a so called 'release' for my erlang app)
|
[06:41] yrashk
|
https://gist.github.com/2b197cc5718a9f6452e6
|
[06:41] yrashk
|
peer_identity = 0x870e5bfa <Address 0x870e5bfa out of bounds>
|
[06:41] yrashk
|
looks suspicious
|
[06:41] yrashk
|
any ideas what to do to figure out the source of the problem
|
[06:41] yrashk
|
happens only in that aforementioned setting and only on OS X
|
[06:41] yrashk
|
works fine on linux
|
[06:45] yrashk
|
sustrik: ^^^
|
[06:51] yrashk
|
and yes, this is 2.1.0 and master
|
[07:31] sustrik
|
yrashk: aren't you using same socket from multiple threads?
|
[07:32] yrashk
|
well, all sockets are created in thread #1
|
[07:32] yrashk
|
then I use push socket from a thread #1
|
[07:32] yrashk
|
and a pul socket from thread #2
|
[07:32] yrashk
|
pull*
|
[07:33] yrashk
|
will that result in an undefined behaviour?
|
[07:35] sustrik
|
if passing of the socket from #1 to #2 is well synchronised, then it should be ok
|
[07:36] yrashk
|
Each ØMQ socket belonging to a particular context may only be used by the thread that created it using zmq_socket().
|
[07:37] yrashk
|
I might have missed this
|
[07:37] yrashk
|
so the documentation says I have to use the socket form a thread that created it?
|
[07:37] yrashk
|
from*
|
[07:37] sustrik
|
with 2.1 the restriction is alleviated
|
[07:37] yrashk
|
k
|
[07:37] sustrik
|
you *can* pass sockets between threads
|
[07:38] sustrik
|
but you can't access a socket from 2 threads in parallel
|
[07:38] yrashk
|
I am never touching that pull socket from the 1st thread
|
[07:38] yrashk
|
except for its initialization
|
[07:39] sustrik
|
how do you pass the socket to the other thread?
|
[07:40] yrashk
|
it just lives in a class instance that 1st thread creates
|
[07:42] yrashk
|
after that the 1st thread creates a 2nd thread and passes that object as an argument to a thread function
|
[07:42] sustrik
|
ok, i see
|
[07:42] sustrik
|
that should be ok
|
[07:43] yrashk
|
is there anything else that might result in that segfault? specifically on OSX
|
[07:43] yrashk
|
as Linux build seems to work just fine
|
[07:43] yrashk
|
at least I am yet to see a single crash in the same scenario
|
[07:43] yrashk
|
while OSX build crashes 80-90% times
|
[07:44] sustrik
|
well, what you have shown me looks like a socket has been closed while still being used
|
[07:44] sustrik
|
either from another thread
|
[07:44] sustrik
|
or from a single thread this way: zmq_close(x);zmq_recv(x,...);
|
[07:45] sustrik
|
do you have a minimal test case?
|
[07:45] yrashk
|
not yet
|
[07:45] yrashk
|
the setting is pretty complicated
|
[07:45] sustrik
|
ok
|
[07:45] yrashk
|
as it works just fine when not packaged as an erlang app release
|
[07:47] yrashk
|
I am just trying to think of any possible explanation of this segfault
|
[07:48] sustrik
|
library mismatch?
|
[07:48] sustrik
|
the recv that fails, is it the first recv called?
|
[07:50] yrashk
|
nope
|
[07:50] yrashk
|
normally anywhere from like 4 calls to about, say, 20
|
[07:52] sustrik
|
memory overwrite then?
|
[07:52] sustrik
|
it's just guesswork
|
[07:53] yrashk
|
well may be
|
[07:54] yrashk
|
but I am not quite sure how this can happen
|
[07:54] yrashk
|
and only when the app is packaged
|
[07:54] sustrik
|
it's a segfault, right?
|
[07:55] sustrik
|
does it print out the address it segfaults at?
|
[07:56] yrashk
|
it is
|
[07:57] yrashk
|
https://gist.github.com/63c48785c8f42c6c1f8b
|
[07:58] sustrik
|
i don't see the notification about the segfault there
|
[07:58] sustrik
|
i mean what address it tries to access that is out of bounds?
|
[07:59] yrashk
|
peer_identity = 0x870e5bfa <Address 0x870e5bfa out of bounds>
|
[07:59] yrashk
|
I guess this is it
|
[08:00] sustrik
|
nope
|
[08:00] sustrik
|
it prints "segfault" somewhere
|
[08:00] sustrik
|
is there an address mentioned there?
|
[08:01] yrashk
|
it didn't print any address
|
[08:02] yrashk
|
anyway I think you led me into something
|
[08:04] yrashk
|
thanks a lot!
|
[08:07] yrashk
|
I really appreciate your help, sustrik -- I think I got it
|
[08:08] sustrik
|
what's the problem?
|
[08:09] yrashk
|
apparently due to the hack I used to ensure a NIF module is loaded, I was calling initialization of that NIF twice
|
[08:09] yrashk
|
and it rewrote the context
|
[08:09] yrashk
|
facepalm
|
[08:09] yrashk
|
absolute facepalm
|
[08:10] yrashk
|
well I didn't check that it actually overwrote the context
|
[08:10] yrashk
|
but it is fairly trivial to guess that from the code
|
[08:10] yrashk
|
because it's my code
|
[08:10] yrashk
|
:]
|
[08:12] sustrik
|
:)
|
[10:00] pieterh
|
hi guys
|
[10:00] pieterh
|
seems our email server was out of action for a while
|
[10:00] pieterh
|
it looks like stuff was queued but not being sent out
|
[10:01] pieterh
|
this would have affected the zeromq-dev list presumably
|
[10:01] pieterh
|
anyhow, I rebooted the beast and emails are now slowly appearing
|
[13:57] pieterh
|
Anyone hitting "Successful WSASTARTUP not yet performed (c:\work\src\zeromq2\src\mailbox.cpp:263)" on Windows
|
[13:57] pieterh
|
?
|
[14:18] Guthur
|
pieterh, never seen such an error myself
|
[14:18] Guthur
|
which windows version?
|
[14:21] pieterh
|
Guthur: I'm running WinXP, this hits at zmq_init()
|
[14:21] pieterh
|
Seems new to 2.1.0
|
[14:22] Guthur
|
I've been running 2.1.0 ok recently, WinXP as well
|
[14:22] Guthur
|
sorry, that's not really helpful for you though
|
[14:23] pieterh
|
I'll check if zmq is doing WSAStartup or not...
|
[14:24] Guthur
|
i'm looking at IOCP today for ICP on win
|
[14:24] Guthur
|
not sure if I will have it ready this weekend though, it's not very well documented
|
[14:25] Guthur
|
ICP/IPC
|
[14:27] pieterh
|
getting ipc: to work on win32 would be great
|
[14:28] pieterh
|
for some reason I'm getting zmq calling make_socketpair before doing WSAStartup... strange
|
[14:37] sustrik
|
pieterh: what version are you using?
|
[14:37] pieterh
|
latest from github
|
[14:37] pieterh
|
stepping through, it definitely tries to create a mailbox socket pair before doing WSAStartup
|
[14:38] pieterh
|
C++ is a joy to understand
|
[14:38] sustrik
|
how come?
|
[14:38] sustrik
|
see ctx.cpp
|
[14:38] pieterh
|
I am staring at it :-)
|
[14:38] sustrik
|
ctx_t constructor
|
[14:38] sustrik
|
line 36
|
[14:38] pieterh
|
any specific line no?
|
[14:39] sustrik
|
the very first thing done is WSAStartup
|
[14:39] pieterh
|
well, line 36 is a blank line here
|
[14:39] pieterh
|
yes, the very first thing it does is WSAStartup
|
[14:40] sustrik
|
that's zmq_init() implementation
|
[14:40] pieterh
|
ok, when I debug it step by step, I get...
|
[14:40] pieterh
|
(hang on, it'll take me a second...)
|
[14:41] Guthur
|
cool, just got feedback of some users of zmq2 and clrzmq2 being used as core tech, sweet
|
[14:42] Guthur
|
good to be getting feedback on some field tests
|
[14:42] pieterh
|
sustrik: array, vector, mutex_t, vector, mailbox constructors before it does first line of ctx constructor
|
[14:42] pieterh
|
that is, after calling ctx constructor from zmq_init...
|
[14:42] sustrik
|
wait a sec, checking...
|
[14:42] pieterh
|
Guthur: saw that on twitter... nice
|
[14:43] sustrik
|
ok, got it
|
[14:43] sustrik
|
let me fix it
|
[14:43] pieterh
|
excellent, can you explain what it's doing?
|
[14:43] Guthur
|
pieterh, they already caught a couple of bugs, so paying dividends already
|
[14:43] Guthur
|
bugs in clrzmq2
|
[14:44] pieterh
|
It's nice to have users :-)
|
[14:44] sustrik
|
constructors of embedded object are called *before* the constructor of the main object
|
[14:44] pieterh
|
sustrik: ah, and mailbox is embedded I guess
|
[14:44] sustrik
|
ctx_t has a member called term_mailbox
|
[14:44] pieterh
|
right
|
[14:44] sustrik
|
right
|
[14:45] pieterh
|
you could move the WSAStartup code to zmq_init
|
[14:45] sustrik
|
yes, i should
|
[14:45] pieterh
|
well, let me try that, test it, submit a patch
|
[14:45] sustrik
|
goodo
|
[14:45] pieterh
|
it'll take me 3 minutes...
|
[14:46] sustrik
|
also, to retain symmetry, move WSACleanup to zmq_term()
|
[14:50] zchrish
|
In C++, should a context ever be introduced in a place other than in main()?
|
[14:52] zchrish
|
I am trying to design a thread management strategy to incorporate some sort of error management. Will a context ever become corrupted?
|
[14:59] pieterh
|
sustrik: ok, fixed and tested, sending patch now
|
[15:04] pieterh
|
zchrish: you can create a context anywhere you like but two threads that want to communicate via inproc: must share the same context
|
[15:04] pieterh
|
so the natural place is usually where you create child threads, which is usually main()
|
[15:05] pieterh
|
and no, a context will not become corrupted unless your application overwrites memory erroneously
|
[15:05] zchrish
|
OK; thanks.
|
[15:17] sustrik
|
pieterh: please, sign-off the patch
|
[15:17] sustrik
|
(commit -s)
|
[15:40] Guthur
|
sustrik, he has left
|
[15:54] sustrik
|
ah, missed that, thanks
|
[15:55] Guthur
|
sustrik, question about the IOCP integration...
|
[15:55] sustrik
|
sure
|
[15:55] sustrik
|
go on
|
[15:55] Guthur
|
will the zmq engine be able to call PostQueuedCompletionStatus on socket recvs and sends
|
[15:56] sustrik
|
?
|
[15:56] Guthur
|
i could be missing something but that seems to be how IOCP works, it's just a means of syncing stuff
|
[15:57] Guthur
|
so the polling object calls GetQueuedCompletionStatusEx to get any signalled events
|
[15:58] Guthur
|
but i'm having trouble seeing where these would get signalled form
|
[15:58] Guthur
|
from*
|
[15:58] sustrik
|
i would say the named pipe would signal it
|
[15:58] sustrik
|
you don't need to do that yourself
|
[15:58] Guthur
|
that's what i was initially thinking too
|
[15:59] Guthur
|
i'll dig a bit more
|
[15:59] Guthur
|
the documentation is useless
|
[15:59] Guthur
|
MSDN is crap
|
[16:01] sustrik
|
any examples out there?
|
[16:02] Guthur
|
some stuff, I have more code here, but they do seem to be calling PostQueuedCompletionStatus explicitly
|
[16:02] Guthur
|
they are passing custom overlapped structs with event details
|
[16:03] Guthur
|
I have some server code here, i'll look through that
|
[16:04] sustrik
|
Have you seen this:
|
[16:04] sustrik
|
http://lists.zeromq.org/pipermail/zeromq-dev/2010-September/006240.html
|
[16:06] Guthur
|
I hadn't seen that
|
[16:59] Guthur
|
I think I'm just going to have to throw some code together and experiment
|
[17:23] sustrik
|
maybe discussing it at some windows forum may give you some insight into different technologies
|
[17:23] Guthur
|
sustrik, way ahead. hehe
|
[17:23] Guthur
|
talking to someone on #winapi
|
[17:23] sustrik
|
:)
|
[17:23] Guthur
|
it is indeed possible to get events automatically from pipes via IOCP
|
[17:23] Guthur
|
it's just a little confusing
|
[17:32] sustrik
|
i see
|
[17:36] CIA-21
|
zeromq2: Pieter Hintjens master * r14a0e14 / (src/ctx.cpp src/zmq.cpp):
|
[17:36] CIA-21
|
zeromq2: Fixed win32 issue with WSAStartup
|
[17:36] CIA-21
|
zeromq2: - ctx constructor was calling mailbox_t constructor implicitly
|
[17:36] CIA-21
|
zeromq2: - moved WSAStartup and WSACleanup to be outside constructor/destructor
|
[17:36] CIA-21
|
zeromq2: Signed-off-by: Pieter Hintjens <ph@imatix.com> - http://bit.ly/f693K5
|
[19:01] eut
|
does anyone use the lua zmq bindings? i'm having some trouble with the nonblocking recv
|
[19:02] eut
|
it seems as though i can never receive a message
|
[19:05] cremes
|
eut: what kind of sockets are you using in your test?
|
[19:05] eut
|
xrep/xreq
|
[19:09] eut
|
ah, never mind...
|
[19:09] cremes
|
ok
|
[19:09] eut
|
it looks like zmq buffers outgoing messages, sending several all at once
|
[19:10] eut
|
so sometimes i would quit listening before it finally sent
|
[19:10] cremes
|
i believe it has an internal timer on its I/O thread so that messages are coalesced and sent
|
[19:10] cremes
|
kind of like nagle's algorithm
|
[19:10] eut
|
ok i see
|
[19:11] eut
|
is there a way to influence that internal timer (or whatever)?
|
[19:18] cremes
|
eut: don't know... plus, i may be wrong on that
|
[19:18] cremes
|
ask the mailing list; be sure to include a pointer to your code just in case it's a different issue
|
[19:19] eut
|
ok
|
[19:35] zedas
|
hey can anyone point me at the docs on how 2.1.0 does graceful shutdown? apparently there's a change where sockets will LINGER or not?
|
[19:37] cremes
|
zedas: https://github.com/zeromq/zeromq2/blob/master/doc/zmq_term.txt
|
[19:38] cremes
|
and
|
[19:38] cremes
|
https://github.com/zeromq/zeromq2/blob/master/doc/zmq_setsockopt.txt
|
[19:38] cremes
|
by default, it will "linger" forever until all packets are flushed
|
[19:47] Guthur
|
ah, i think i'm making progress
|
[19:47] Guthur
|
i'll not have the IOCP in ZMQ tonight, but I think i might be able to get it in soon enough
|
[19:48] Guthur
|
I have it working in a small test client server app
|
[20:03] zedas
|
cremes: ok thanks, i've gotta get 2.1.0 working with mongrel2 and this is the only thing that's broken right now.
|