Project

General

Profile

Bug #6367

WIOService Freeze under hevy load?

Added by Drus Kor over 3 years ago. Updated 10 months ago.

Status:
Closed
Priority:
High
Assignee:
-
Target version:
Start date:
05/08/2018
Due date:
% Done:

0%

Estimated time:

Description

After several minutes of load test:

- Wt didn't answer anything by http.

- Telnet can connect to port. But no any answer.

- All threads work in asio::io_service::run();

  • No one funcrion of WIOService be call

it is simply to repeat.

Where need look?

What kind of info need to debug?


Files

trunk.tar.gz (2.54 KB) trunk.tar.gz Qt web test Drus Kor, 06/13/2018 01:50 PM
#1

Updated by Wim Dumon over 3 years ago

Hello Drus,

That's not normal in any case.

What version of boost and what operating system? Are you using prebuilt binaries?

For debugging, stack traces of all threads at the moment of the lockup would be the first place to get an idea of what is going on. Often it gives an idea on what your application is stuck.

Best regards,

Wim.

#2

Updated by Drus Kor over 3 years ago

Hello!

I'm testing this case on kubuntu 16,17,18 with default boost library.

freeze only with

app->enable Updates();

std::unique_ptr<WApplication> createApplication( const WEnvironment& env )
{
std::unique_ptr<WApplication> app = std::make_unique<WApplication>( env );
app->enableUpdates();
return app;
}

on 8-core cpu run 100 active browser tabs with active javascript and load random page in while loop.

I'll post debugging later. thx.

#3

Updated by Drus Kor over 3 years ago

9 threads stay under

@

1 futex_wait_cancelable futex-internal.h 88 0x7ffff54d1072

2 __pthread_cond_wait_common pthread_cond_wait.c 502 0x7ffff54d1072

3 __pthread_cond_wait pthread_cond_wait.c 655 0x7ffff54d1072

4 boost::asio::detail::posix_event::wait<boost::asio::detail::scoped_lockboost::asio::detail::posix_mutex> posix_event.hpp 106 0x7ffff76efeb8

5 boost::asio::detail::task_io_service::do_run_one task_io_service.ipp 380 0x7ffff76efeb8

6 boost::asio::detail::task_io_service::run task_io_service.ipp 149 0x7ffff76efeb8

7 boost::asio::io_service::run io_service.ipp 59 0x7ffff76efeb8

8 Wt::WIOService::run WIOService.C 180 0x7ffff76efeb8

9 ?? 0x7ffff59b50ff

10 start_thread pthread_create.c 465 0x7ffff54ca7fc

11 clone clone.S 95 0x7ffff51f7b5f

@

one

1 __GI_epoll_pwait epoll_pwait.c 42 0x7ffff51f7cd7
2 epoll_wait epoll_wait.c 30 0x7ffff51f7e38
3 boost::asio::detail::epoll_reactor::run epoll_reactor.ipp 392 0x7ffff76f33af
4 boost::asio::detail::task_io_service::do_run_one task_io_service.ipp 356 0x7ffff76efc51
5 boost::asio::detail::task_io_service::run task_io_service.ipp 149 0x7ffff76efc51
6 boost::asio::io_service::run io_service.ipp 59 0x7ffff76efc51
7 Wt::WIOService::run WIOService.C 180 0x7ffff76efc51
8 ?? 0x7ffff59b50ff
9 start_thread pthread_create.c 465 0x7ffff54ca7fc
10 clone clone.S 95 0x7ffff51f7b5f

any of threads enter in above code

after that log contains only [post]

127.0.0.1 - - [2018-May-10 12:20:51.809] "POST /?wtd=f2ySCWIdDI0Pc4Fx HTTP/1.1" 200 0
127.0.0.1 - - [2018-May-10 12:20:51.809] "POST /?wtd=ffkbTBDj4om61eKZ HTTP/1.1" 200 0
127.0.0.1 - - [2018-May-10 12:20:51.809] "POST /?wtd=g0yaLs65c6QHff8s HTTP/1.1" 200 0
127.0.0.1 - - [2018-May-10 12:20:51.810] "POST /?wtd=gBNbqQgHcSztPHhC HTTP/1.1" 200 0
127.0.0.1 - - [2018-May-10 12:20:51.810] "POST /?wtd=goT08mYPplrQbE6t HTTP/1.1" 200 0
127.0.0.1 - - [2018-May-10 12:20:51.810] "POST /?wtd=hUNOTOwdyg8CDZ8h HTTP/1.1" 200 0
127.0.0.1 - - [2018-May-10 12:20:51.810] "POST /?wtd=ibq11KgZp8laZzBZ HTTP/1.1" 200 0
127.0.0.1 - - [2018-May-10 12:20:51.811] "POST /?wtd=idJEKtis0sH6onWR HTTP/1.1" 200 0
127.0.0.1 - - [2018-May-10 12:20:51.811] "POST /?wtd=imTj2eGEDZfjsF7O HTTP/1.1" 200 0
127.0.0.1 - - [2018-May-10 12:20:51.811] "POST /?wtd=ipe5xMxyKmGAGzk6 HTTP/1.1" 200 0
127.0.0.1 - - [2018-May-10 12:20:51.811] "POST /?wtd=ngKr8xHgs8qbGkDv HTTP/1.1" 200 0
127.0.0.1 - - [2018-May-10 12:20:51.812] "POST /?wtd=owFPgDnjytRFsGbr HTTP/1.1" 200 0
127.0.0.1 - - [2018-May-10 12:20:51.812] "POST /?wtd=rOMGKU6owgBJVgUB HTTP/1.1" 200 0
127.0.0.1 - - [2018-May-10 12:20:51.812] "POST /?wtd=qmN5EaZtj0fsVP17 HTTP/1.1" 200 0
127.0.0.1 - - [2018-May-10 12:20:51.813] "POST /?wtd=sHLWSHCQksStOBkj HTTP/1.1" 200 0
127.0.0.1 - - [2018-May-10 12:20:51.813] "POST /?wtd=rSJCyf8fYpmZWoM1 HTTP/1.1" 200 0
127.0.0.1 - - [2018-May-10 12:20:51.813] "POST /?wtd=tJiPxV9r0BoEiBtD HTTP/1.1" 200 0
127.0.0.1 - - [2018-May-10 12:20:51.813] "POST /?wtd=tnlK5WbhWhKsCfAK HTTP/1.1" 200 0
127.0.0.1 - - [2018-May-10 12:20:51.814] "POST /?wtd=umIk9DiiRcE6ykbs HTTP/1.1" 200 0
127.0.0.1 - - [2018-May-10 12:20:51.814] "POST /?wtd=un1qyHZrcpCieyNi HTTP/1.1" 200 0
127.0.0.1 - - [2018-May-10 12:20:51.814] "POST /?wtd=vR2laUqRoJ3Y82IS HTTP/1.1" 200 0
127.0.0.1 - - [2018-May-10 12:20:51.814] "POST /?wtd=wLFtNdCecq8Ycid9 HTTP/1.1" 200 0
127.0.0.1 - - [2018-May-10 12:20:51.815] "POST /?wtd=yWybxiatbaYiPzWG HTTP/1.1" 200 0
127.0.0.1 - - [2018-May-10 12:20:51.815] "POST /?wtd=zCwdfAhG0KUbz1V1 HTTP/1.1" 200 0

seems need debug build.

It seems that debugging is needed.

Hope I'm doing it in the evening.

#4

Updated by Drus Kor over 3 years ago

Normal work trace

@

1 Wt::WIOService::schedule(std::chrono::duration<long, std::ratio<1l, 1000000000l>>, std::function<void ()> const&) WIOService.C 133 0x7ffff7270134

2 Wt::WServer::schedule(std::chrono::duration<long, std::ratio<1l, 1000000000l>>, std::string const&, std::function<void ()> const&, std::function<void ()> const&) WServer.C 225 0x7ffff7362d00

3 Wt::WebSession::handleRequest WebSession.C 1575 0x7ffff7761de1

4 Wt::WebController::handleRequest WebController.C 756 0x7ffff77414bd

5 std::__invoke_impl<void, void (Wt::WebController:: &)(Wt::WebRequest), Wt::WebController &, http::server::HTTPRequest&> invoke.h 73 0x7ffff6078e82

6 std::__invoke<void (Wt::WebController:: &)(Wt::WebRequest), Wt::WebController &, http::server::HTTPRequest&> invoke.h 95 0x7ffff6078cf0

7 std::Bind<void (Wt::WebController:: *)(Wt::WebRequest)>::_call<void, , 0ul, 1ul>(std::tuple<>&&, std::_Index_tuple<0ul, 1ul>) functional 467 0x7ffff6078b6c

8 std::_Bind<void (Wt::WebController:: *
)(Wt::WebRequest
)>::operator()<, void>() functional 551 0x7ffff60789ed

9 boost::asio::asio_handler_invoke<std::_Bind<void (Wt::WebController:: )(Wt::WebRequest)>>(std::_Bind<void (Wt::WebController:: )(Wt::WebRequest)>&, ...) handler_invoke_hook.hpp 69 0x7ffff607887d

10 boost_asio_handler_invoke_helpers::invoke<std::_Bind<void (Wt::WebController:: )(Wt::WebRequest)>, std::_Bind<void (Wt::WebController:: )(Wt::WebRequest)>>(std::_Bind<void (Wt::WebController:: )(Wt::WebRequest)>&, std::_Bind<void (Wt::WebController:: )(Wt::WebRequest)>&) handler_invoke_helpers.hpp 37 0x7ffff60786ed

11 boost::asio::detail::completion_handler<std::_Bind<void (Wt::WebController:: )(Wt::WebRequest)>>::do_complete(boost::asio::detail::task_io_service , boost::asio::detail::task_io_service_operation, boost::system::error_code const&, unsigned long) completion_handler.hpp 68 0x7ffff60780a0

12 boost::asio::detail::task_io_service_operation::complete task_io_service_operation.hpp 38 0x7ffff7271b9a

13 boost::asio::detail::task_io_service::do_run_one task_io_service.ipp 372 0x7ffff7273d05

14 boost::asio::detail::task_io_service::run task_io_service.ipp 149 0x7ffff7273729

15 boost::asio::io_service::run io_service.ipp 59 0x7ffff7274014

16 Wt::WIOService::run WIOService.C 180 0x7ffff727059b

17 std::__invoke_impl<void, void (Wt::WIOService:: &)(), Wt::WIOService&> invoke.h 73 0x7ffff7279f3f

18 std::__invoke<void (Wt::WIOService:: &)(), Wt::WIOService&> invoke.h 95 0x7ffff727969a

19 std::Bind<void (Wt::WIOService:: *(Wt::WIOService *))()>::_call<void, , 0ul>(std::tuple<>&&, std::_Index_tuple<0ul>) functional 467 0x7ffff727865f

20 std::_Bind<void (Wt::WIOService:: *(Wt::WIOService *))()>::operator()<, void>() functional 551 0x7ffff727799b

...

@

after break

1 boost::asio::detail::task_io_service_operation::complete task_io_service_operation.hpp 38 0x7ffff7271b80
2 boost::asio::detail::task_io_service::do_run_one task_io_service.ipp 372 0x7ffff7273d05
3 boost::asio::detail::task_io_service::run task_io_service.ipp 149 0x7ffff7273729
4 boost::asio::io_service::run io_service.ipp 59 0x7ffff7274014
5 Wt::WIOService::run WIOService.C 180 0x7ffff727059b
6 std::__invoke_impl<void, void (Wt::WIOService:: *&)(), Wt::WIOService *&> invoke.h 73 0x7ffff7279f3f
7 std::__invoke<void (Wt::WIOService:: *&)(), Wt::WIOService *&> invoke.h 95 0x7ffff727969a
8 std::_Bind<void (Wt::WIOService:: *(Wt::WIOService *))()>::__call<void, , 0ul>(std::tuple<>&&, std::_Index_tuple<0ul>) functional 467 0x7ffff727865f
9 std::_Bind<void (Wt::WIOService:: *(Wt::WIOService *))()>::operator()<, void>() functional 551 0x7ffff727799b
10 std::__invoke_impl<void, std::_Bind<void (Wt::WIOService:: *(Wt::WIOService *))()>>(std::__invoke_other, std::_Bind<void (Wt::WIOService:: *(Wt::WIOService *))()>&&) invoke.h 60 0x7ffff7276aa7
11 std::__invoke<std::_Bind<void (Wt::WIOService:: *(Wt::WIOService *))()>>(std::_Bind<void (Wt::WIOService:: *(Wt::WIOService *))()>&&) invoke.h 95 0x7ffff7275d9f
12 std::thread::_Invoker<std::tuple<std::_Bind<void (Wt::WIOService:: *(Wt::WIOService *))()>>>::_M_invoke<0ul>(std::_Index_tuple<0ul>) thread 234 0x7ffff727d0de
13 std::thread::_Invoker<std::tuple<std::_Bind<void (Wt::WIOService:: *(Wt::WIOService *))()>>>::operator()() thread 243 0x7ffff727cb6c
14 std::thread::_State_impl<std::thread::_Invoker<std::tuple<std::_Bind<void (Wt::WIOService:: *(Wt::WIOService *))()>>>>::_M_run() thread 186 0x7ffff727c730
15 ?? 0x7ffff43ff0ff
16 start_thread pthread_create.c 465 0x7ffff3f147fc
17 clone clone.S 95 0x7ffff3c41b5f

No enter in function

11 boost::asio::detail::completion_handler<std::_Bind<void (Wt::WebController:: *(Wt::WebController *, http::server::HTTPRequest *))(Wt::WebRequest *)>>::do_complete(boost::asio::detail::task_io_service *, boost::asio::detail::task_io_service_operation *, boost::system::error_code const&, unsigned long) completion_handler.hpp 68 0x7ffff60780a0

#5

Updated by Roel Standaert over 3 years ago

  • Status changed from New to Feedback

Could you possibly share the test environment you're using? Having over 100 tabs open seems to be working alright for me after I increased the maximum connections per server limit of my browser (note that I'm using Boost 1.67 on Ubuntu 16.04, though).

#6

Updated by Drus Kor over 3 years ago

Hello!

I'm testing app by Qt.

Internal path value don't have any meaning.

May be important redirect wApp output to /dev/null.

Today i have some troubles to reproduce bug. Start over 20+ test app ( not from Ide ).

Checking work status by browser tab.

#7

Updated by Roel Standaert over 3 years ago

Are you saying that right now you can't reproduce it either? Because I can't either. I was testing hello.wt and at some point it had over 12000 sessions, and it was still working properly.

#8

Updated by Drus Kor over 3 years ago

Think problem is not in connections count. It is in dynamic create/delete sessions. Try sending code to detect "lost response bug".

Bug reproduced, sorry my eng.skill

#9

Updated by Roel Standaert about 3 years ago

  • Target version changed from 4.0.4 to 4.0.5
#10

Updated by Roel Standaert almost 3 years ago

  • Target version changed from 4.0.5 to 4.1.0
#11

Updated by Wim Dumon 11 months ago

  • Description updated (diff)
  • Status changed from Feedback to Confirmed
  • Target version changed from 4.1.0 to 4.5.0

Hello Drus,

We could reproduce this, a fix will be in git soon. Your stack trace and description matches with the problem we identified.

BR,
Wim.

#12

Updated by Roel Standaert 11 months ago

  • Status changed from Confirmed to Implemented @Emweb
#13

Updated by Roel Standaert 11 months ago

  • Status changed from Implemented @Emweb to Resolved
#14

Updated by Roel Standaert 10 months ago

  • Status changed from Resolved to Closed

Also available in: Atom PDF