aboutsummaryrefslogtreecommitdiffstats
path: root/lib/bb/server
AgeCommit message (Collapse)Author
2017-08-21process: Ensure we call select() to know which fds to readRichard Purdie
There is an interesting bug in the current code where a sync command is not seen until the current async command completes, by which time the UI may have shut down. The reason is that if there are idle commands, we may not end up sleeping in the select call at all, partiularly under heavy load like parsing. Fix this by calling select with a zero timeout so that we see active fds and know to read from them. This fixes various problems toaster was having with the recent server changes. [YOCTO #11898] Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-08-15process: Increase server startup timeoutRichard Purdie
We're seeing the server fail to start within 8s on heavily loaded autobuilders so increase this timeout to 30s which should be more than enough time. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-08-14process: Improve client disconnectsRichard Purdie
There have been cases where the server could loop indefinitely and incorrectly handle client disconnects. In the EOFError case, ensure a full disconnect happens in the alternative disconnect path to avoid this. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-08-09server/process: Ensure we don't loop on client EOFErrorRichard Purdie
The server currently crashes if we hit an EOFError due to controllersock still being in ready and the continue meaning ready isn't re-evaluated. Setting the value to False can mean the shutdown code doesn't handle the situation cleanly. Clear ready to avoid the crash/loop instead and handle any OSError whilst we're in here. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-08-08main: Handle BB_SERVER_TIMEOUT = -1 for no server timeoutRobert Yang
Make BB_SERVER_TIMEOUT = -1 mean no unload forever. Signed-off-by: Robert Yang <liezhi.yang@windriver.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-08-08process: Fix disconnect when BB_SERVER_TIMEOUTRobert Yang
Fixed: $ export BB_SERVER_TIMEOUT=10000 $ bitbake --server-only $ bitbake --status-only [snip] File "/buildarea/lyang1/poky/bitbake/lib/bb/server/process.py", line 472, in recvfds msg, ancdata, flags, addr = sock.recvmsg(1, socket.CMSG_LEN(bytes_size)) OSError: [Errno 9] Bad file descriptor And: $ export BB_SERVER_TIMEOUT=10000 $ bitbake --server-only -B localhost:-1 $ bitbake --status-only # Everything is fine in first run $ bitbake --status-only [snip] File "/buildarea/lyang1/poky/bitbake/lib/bb/server/process.py", line 472, in recvfds msg, ancdata, flags, addr = sock.recvmsg(1, socket.CMSG_LEN(bytes_size)) OSError: [Errno 9] Bad file descriptor This was because self.controllersock was not set to False, so it still ran sock.recvmsg() when sock was closed. And also need set command_channel to Flase, otherwise the self.command_channel.get() will always run when EOF, and cause infinite loop. Signed-off-by: Robert Yang <liezhi.yang@windriver.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-07-31process: Add some extra server startup logsRichard Purdie
We have cases where the server is being started but we're not seeing any messages from it. Add some earlier logging so we can try and better understand where issues may be occurring. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-07-31process: Reorder server command processing and handle EOFErrorRichard Purdie
If the connection control socket and the command channel close together, we can race and hit EOFError exceptions before we close the channel. Reorder the code to handle this in the correct order and ignore the EOFError exceptions as they mean the client is disconnecting and shouldn't terminate the server. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-07-28process: Clean up server communication timeout errorsRichard Purdie
This timeout path was commonly hit due to errors starting the server. Now we have a better way to handle that, the retry logic can be improved and cleaned up. This patch: * Makes the timeout 5s rather than intervals of 1s with a message. Paul noted some commands can take around 1s to run on a server which has just been started on a loaded system. * Allows a broke connection to exit immediately rather than retrying something which will never work. * Drops the Ctrl+C masking, we shouldn't need that anymore and any issues would be better handled in other ways. This should make things clearer and less confusing for users and is much cleaner code too. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-07-28process: Don't leak open pipes upon reconnectionRichard Purdie
If we reconnect to the server, stop leaking pipes and clean up after ourselves. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-07-28process/cooker: Allow UI process to know if the cooker was started successfullyRichard Purdie
Currently if the server fails to start, the user sees no error message and the server will be repeatedly attempted to be started until some longer timeouts expire. There are error messages in the cookerdeamon log but nobody thinks to look there. Add in a pipe which can be used to tell the starting process whether the cooker did actually start or not. If it fails to start, no further attempts can be made and if present, the log file can be shown to the user. [YOCTO #11834] Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-07-28process: Move socket keep alive into BitBakeProcessServerConnectionRichard Purdie
This cleans up the socket keep alive into better class structured code and adds cleanup of the open file descriptors upon shutdown. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-07-28process: Allow BBUIEventQueue to exit cleanlyRichard Purdie
Currently the monitoring thread exits with some error code or runs indefinitely. Allow closure of the pipe its monitoring to have the thread exit cleanly/silently. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-07-28process: Ensure ConnectionReader/Writer have fileno() and close() methodsRichard Purdie
Expose the underlying close() and fileno() methods which allow connection monitoring and cleanup. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-07-24process: Change timeout warning to a noteRichard Purdie
The warning message currently shown can occur more frequently than previously if a previous bitbake server is shutting down and we're reconnecting to a new server. Change it to a note message to match the higher level connection logging retry messages and so as not to interfer with selftests. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-07-24cooker/process: Drop server_main functionRichard Purdie
Now that there is only one server, this abstraction is no longer needed and causes indrection/confusion. The server shutdown is also broken with the cooker post_server calls happening too late, leading to "lock held" warnings in the logs if PRServ is enabled. Remove the abstraction and put the shutdown calls in the right order with respect to the locking. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-07-21server/process.py: fix self.bitbake_lock.write()Robert Yang
There is no global var "configuration", so the old code hang at self.bitbake_lock.write(), and nothing wrote to bitbake.lock. I didn't figure out why it hang (but not print errors). Reproducer: $ bitbake -B localhost:-1 world -k Check bitbake.log, there was nothing, now fixed. Signed-off-by: Robert Yang <liezhi.yang@windriver.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-07-21server: Rework the server API so process and xmlrpc servers coexistRichard Purdie
This changes the way bitbake server works quite radically. Now, the server is always a process based server with the option of starting an XMLRPC listener on a specific inferface/port. Behind the scenes this is done with a "bitbake.sock" file alongside the bitbake.lock file. If we can obtain the lock, we know we need to start a server. The server always listens on the socket and UIs can then connect to this. UIs connect by sending a set of three file descriptors over the domain socket, one for sending commands, one for receiving command results and the other for receiving events. These changes meant we can throw away all the horrid server abstraction code, the plugable transport option to bitbake and the code becomes much more readable and debuggable. It also likely removes a ton of ways you could hang the UI/cooker in weird ways due to all the race conditions that existed with previous processes. Changes: * The foreground option for bitbake-server was dropped. Just tail the log if you really want this, the codepaths were complicated enough without adding one for this. * BBSERVER="autodetect" was dropped. The server will autostart and autoconnect in process mode. You have to specify an xmlrpc server address since that can't be autodetected. I can't see a use case for autodetect now. * The transport/servetype option to bitbake was dropped. * A BB_SERVER_TIMEOUT variable is added which allows the server to stay resident for a period of time after the last client disconnects before unloading. This is used if the -T/--idle-timeout option is not passed to bitbake. This change is invasive and may well introduce new issues however I believe the codebase is in a much better position for further development and debugging. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-07-18server: Remove base classes and inline codeRichard Purdie
In preparation for rewriting this code, expand the relatively useless base classes into the code itself. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-07-18event/command: Allow UI to request the UI eventhander IDRichard Purdie
The UI may want to change its event mask however to do this, it needs the event handler's ID. Tweak the code to allow this to be stored and add a command to query it. Use the new command in the process server backend. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-07-18bb/server/process: Handle EINTR on idle_commands selectAníbal Limón
If a signal is sent like SIGWINCH the select could be interrupted so ignore the InterruptError like in XMLRPC server [1]. [1] http://git.yoctoproject.org/cgit/cgit.cgi/poky/tree/bitbake/lib/bb/server/xmlrpc.py#n307 Signed-off-by: Aníbal Limón <anibal.limon@linux.intel.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-07-07server/xmlrpc: Add Heartbeat event supportRichard Purdie
When heartbeat event support was added it was only added to process.py. Add it to server/xmlrpc too. There is duplicated code however since we're likely to combine the server abstractions soon its not worth worrying about now. This ensures the backends have the same event support. [YOCTO #10741] Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-07-07event: Queue offline events for the UIRichard Purdie
Messages printed when no UI is connected (e.g. memres) are currently lost. Use the existing queue mechanism to queue these until a UI attaches, then replay them. This isn't ideal but better than the current situation of losing them entirely. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-07-07server/process: Fix waitEvent() calls with 0 timeoutRichard Purdie
You might think Queue.Queue.get(True, 0) would return an event immediately if present and otherwise return. It doesn't, it immediately "times out" and returns with nothing from the queue. The behaviour we want is not to wait but return anything present which is what .get(False) does so map to this. This fixes some odd behaviour observed in some of the tinfoil selftests. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2017-01-23bb/server/process.py: ProcessEventQueue add close of _writer pipeAníbal Limón
Call explicity close in _writer to avoid fd leakage because isn't called on Queue.close() [YOCTO #10873] Signed-off-by: Aníbal Limón <anibal.limon@linux.intel.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2016-12-16bitbake: remove True option to getVar calls (take 2)Joshua Lock
getVar() now defaults to expanding by default, thus remove the True option from getVar() calls with a regex search and replace. Search made with the following regex: getVar ?\(( ?[^,()]*), True\) (a follow on patch to fix up a few recent introductions) Signed-off-by: Joshua Lock <joshua.g.lock@intel.com> Signed-off-by: Ross Burton <ross.burton@intel.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2016-12-14server/process: don't change UI process signal handler on terminatePaul Eggleton
On terminating the connection to the server, we were disabling SIGINT - and this is executed on the UI side. I'm not sure whether the intention here was to undo the SIGINT disabling we did in the server, and it was just a mistake that it disabled rather than restored and it's run on the wrong side, or whether we wanted to stop the user from breaking out of the shutdown code - the commit message provides no clues either way. Regardless, we do not want to permanently disable Ctrl+C here - it's legitimate to terminate the connection to the server and then re-establish it within the same process; at least currently, devtool modify by virtue of using tinfoil in two separate parts of the code does this, and the result of this disabling is that during the second tinfoil usage we can potentially be parsing all recipes without the ability to easily interrupt the process. Signed-off-by: Paul Eggleton <paul.eggleton@linux.intel.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2016-12-14server/xmlrpc: send back 503 response with correct encodingPaul Eggleton
If you send back a string here you get "TypeError: 'str' does not support the buffer interface" errors in bitbake-cookerdaemon.log and "IncompleteRead(0 bytes read, 22 more expected)" errors on the client side. Signed-off-by: Paul Eggleton <paul.eggleton@linux.intel.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2016-12-07cooker process: fire heartbeat event at regular time intervalsPatrick Ohly
The intended usage is for recording current system statistics from /proc in buildstats.bbclass during a build and for improving the BB_DISKMON_DIRS implementation. All other existing hooks are less suitable because they trigger at unpredictable rates: too often can be handled by doing rate-limiting in the event handler, but not often enough (for example, when there is only one long-running task) cannot because the handler does not get called at all. The implementation of the new heartbeat event hooks into the cooker process event queue. The process already wakes up every 0.1s, which is often enough for the intentionally coarse 1s delay between heartbeats. That value was chosen to keep the overhead low while still being frequent enough for the intended usage. If necessary, BB_HEARTBEAT_EVENT can be set to a float specifying the delay in seconds between these heartbeat events. Signed-off-by: Patrick Ohly <patrick.ohly@intel.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2016-07-21bitbake: implement idle timeout for xmlrpc serverEd Bartosh
Idle timeout can be specified either by -T/--idle-timeout option or by sessing BBTIMEOUT environment variable. Bitbake xmlrpc server will unload itself when timeout exprired, i.e. when server is idle for more than <idle timeout> seconds. [YOCTO #5534] Signed-off-by: Ed Bartosh <ed.bartosh@linux.intel.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2016-07-20bitbake: xmlrpc: implement check of connection to serverEd Bartosh
Implemented check_connection function. The purpose of this function is to check if bitbake server is accessible and functional. To check this this function tries to connect to bitbake server and run getVariable command. This API is going to be used to implement autoloading of bitbake server. [YOCTO #5534] Signed-off-by: Ed Bartosh <ed.bartosh@linux.intel.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2016-06-01xmlrpc: add parameter use_builtin_typesEd Bartosh
Added use_builtin_types parameter to XMLRPCProxyServer.__init__ to fix this error: ERROR: Could not connect to server 0.0.0.0:37132 : __init__() got an unexpected keyword argument 'use_builtin_types' Signed-off-by: Ed Bartosh <ed.bartosh@linux.intel.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2016-06-01bitbake: Convert to python 3Richard Purdie
Various misc changes to convert bitbake to python3 which don't warrant separation into separate commits. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2016-05-13server/process: Fix missing log messages issueRichard Purdie
Currently if the server dies, its possible that log messages are never displayed which is particularly problematic if one of those messages is the exception and backtrace the server died with. Rather than having the event queue exit as soon as the server disappears, we should pop events from the queue until its empty before exiting. This patch tweaks that code so that even if the server is dead and we're going to exit, we return any events left in the pipe. This makes debugging certain failures much easier. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2016-03-24bitbake: xmlrpc: set single use mode differentlyEd Bartosh
Currently xmlrpc server implicitly sets itself into single use mode when bitbake server is started with anonymous port (0) or no port is provided in command line. In this mode bitbake shuts down xmlrpc server after build is done. This assumption is incorrect in some cases. For example Toaster uses bitbake in this mode and expects xmlrpc server to stay in memory. Till recent changes single use mode was always unset due to the bug. When the bug was fixed it broke toaster builds as Toaster couldn't communicate with bitbake server in single use mode. Reimplemented logic of setting single use mode. The mode is explicity set when --server-only command line parameter is not provided to bitbake. It doesn't depend on the port number anymore. [YOCTO #9275] [YOCTO #9240] [YOCTO #9252] Signed-off-by: Ed Bartosh <ed.bartosh@linux.intel.com> Signed-off-by: Elliot Smith <elliot.smith@intel.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2016-03-09server/process: Try connecting 4 times before giving upLucas Dutra Nunes
Instead of trying one time with a timeout of 20 seconds try 4 times with a timeout of 5 seconds, to account for a slow server start. Signed-off-by: Lucas Dutra Nunes <ldnunes@ossystems.com.br> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2016-03-09xmlrpc: fix bug in setting XMLRPCServer.single_useEd Bartosh
XMLRPCServer.single_use attribute was always set to False. This caused xmlrpc server to keep running after build is done as BitBakeServerCommands.removeClient only shuts down server if its single_use attribute is set to True. Signed-off-by: Ed Bartosh <ed.bartosh@linux.intel.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2016-01-29bitbake: Set process names to be meaninfulRichard Purdie
This means that when you view the process tree, the processes have meaningful names, aiding debugging: $ pstree -p 30021 bash(30021)───KnottyUI(115579)───Cooker(115590)─┬─PRServ(115592)───{PRServ Handler}(115593) ├─Worker(115630)───bash:sleep(115631)───run.do_sleep.11(115633)───sleep(115634) └─{ProcessEQueue}(115591) $ pstree -p 30021 bash(30021)───KnottyUI(117319)───Cooker(117330)─┬─Cooker(117335) ├─PRServ(117332)───{PRServ Handler}(117333) ├─Parser-1:2(117336) └─{ProcessEQueue}(117331) Applies to parse threads, PR Server, cooker, the workers and execution threads, working within the 16 character limit as best we can. Needed to tweak the bitbake-worker magic values to tell the workers apart. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2016-01-05xmplrpc: split connect methodEd Bartosh
Current code in connect method sets up event queue, which requires registering UI handler. This functionality may not be needed for some operations, e.g. for server termination. Moved functionality of setting up event queue in from 'connect' method to 'setupEventQueue' in BitBakeXMLRPCServerConnection class. Signed-off-by: Ed Bartosh <ed.bartosh@linux.intel.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2016-01-05uievent: add error to registerEventHandler returnEd Bartosh
Current code throws Exception("Could not register UI event handler") if event handler can't be registered. The real reason of this is that cooker is in busy state. Error message lacks information about this. Added error message to the return value of registerEventHandler. Included returned error message into the log message and exception text. Signed-off-by: Ed Bartosh <ed.bartosh@linux.intel.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2015-09-08server/process: Handle SIGTERM more gracefullyRichard Purdie
Currently if you send a SIGTERM to the bitbake UI process, the system basically hangs if tasks are executing. This is because the server process doesn't actually try any kind of shutdown before exiting. This patch trys executing a stateForceShutdown command first, which is enough to stop any active tasks before the system exits. I also noticed that terminate can execute multiple times, once at SIGTERM from the handler and once from the real exit. Double execution leads to stack traces and potential hangs (writes to dead pipes), so ensure the code only can run once. With these fixes, bitbake much more correctly deals with SIGTERM to the UI process. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2015-09-03event/server: Add _uiready flag to handle missing error messagesRichard Purdie
If you start and suspend a bitbake execution so the bitbake lock is held, then try and run "bitbake -w '' X", you will see bitbake return an error exit code but print no message about what happened at all. The reason is that the -w option creates a "UI" which swallows the messages. The code which handles this exit failure mode thinks a UI has printed the messages and therefore doesn't do so. This adds in an extra parameter to the UI registration code so that we can figure out whether its a primary UI or not and base decisions on whether to display information on that instead. This fixes the error shown above and some bizarre failures on the Yocto Project Autobuilder. [YOCTO #8239] Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2015-08-17Fix default function parameter assignment to a listPaul Eggleton
With python you should not assign a list as the default value of a function parameter - because a list is mutable, the result will be that the first time a value is passed it will actually modify the default. Reference: http://docs.python-guide.org/en/latest/writing/gotchas/#mutable-default-arguments Signed-off-by: Paul Eggleton <paul.eggleton@linux.intel.com> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2015-06-19server/process: Don't log BBHandledExceptionRichard Purdie
If we see a BBHandledException in the idle handler, the understanding is the system handled it, printing a log and traceback is just confusing. Therefore only print these in the cases where its an unknown/unhandled exception. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2015-03-10cooker/server: Fix up 100% CPU usage at idleRichard Purdie
The recent inotify changes are causing a 100% cpu usage issue in the idle handlers. To avoid this, we update the idle functions to optionally report a float value which is the delay before the function needs to be called again. 1 second is fine for the inotify handler, in reality its more like 0.1s due to the default idle function sleep. This reverts performance regressions of 1.5 minutes on a kernel build and ~5-6 minutes on a image from scratch. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2015-03-09xmlrpc server: delete function on errorsAlexandru DAMIAN
This patch makes sure to delete an idle function that raises an exception for the xmlrpc server. The counterpart functionality in the process server was added with: commit db50630948394bdcd361f3511af40c1896b1a017. duthor: Richard Purdie <richard.purdie@linuxfoundation.org> Date: Wed Aug 20 22:31:06 2014 +0000 bitbake: process: Deal with infinite looping of the server This patch fixes [YOCTO #7316] Signed-off-by: Alexandru DAMIAN <alexandru.damian@intel.com>
2015-03-09xmlrpcserver: do not connect client on errorAlexandru DAMIAN
We roll back the client connection if some error happens, like during setFeatures, as to leave the server accessible to other clients. Signed-off-by: Alexandru DAMIAN <alexandru.damian@intel.com>
2015-01-23server/process: Fix select callRichard Purdie
There was a report that bitbake -e | less would use 100% cpu when it shouldn't really. The issue appears to be a bogus file descriptor in the select call. We shouldn't be blocking if there is event data pending to a *reader* from server context. [YOCTO #7138] Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2014-09-02process: Ensure abnormal exits set an error levelRichard Purdie
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
2014-08-22process: Further improve robustness against server shutdownRichard Purdie
Currently, if an exception occurs in an event handler, the server shuts down but the UI simply hangs. This happens in two places, firstly waiting for events and secondly, sending events to a server which no longer exists. The latter does time out, the former does not. These patches improve both code sections to check if the main server process is alive and if not, trigger things to shut down gracefully. This avoids the timeout in the command sending case too. This resolves various cases where the UI would simply hang indefintely. Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>