Wt embedded

Version 13 (Koen Deforche, 05/27/2010 07:08 am)

h1. Wt embedded

{{toc}}

This page collects information on running Wt on resource-constrained embedded systems: performance, code size, memory usage, and related notes.

h2. General

Wt can easily be built for and deployed on embedded POSIX systems, such as embedded Linux.

h3. Cross-building

Using CMake with a cross compilation environment: to be completed...

Instructions for cross compiling with CMake can be found on the "CMake Wiki":http://www.cmake.org/Wiki/CMake_Cross_Compiling.

h3. Optimizing executable size

Points to consider when optimizing the executable size.

For building boost:
* Use a static build of boost, which allows the linker to strip away unused symbols
* Use the following compile flags for boost:
** @-fvisibility=hidden -fvisibility-inlines-hidden@: to avoid exporting symbols in the executable
** @-ffunction-sections -fdata-sections@: to allow fine-grained garbage collection of unused functions/data

For building Wt:
* Choose build type @MinSizeRel@
* Extra compile flags (@CMAKE_CXX_FLAGS@)
** @-fvisibility=hidden -fvisibility-inlines-hidden@: to avoid exporting symbols in the executable
** @-ffunction-sections -fdata-sections@: to allow fine-grained garbage collection of unused functions/data
** @-DHAVE_GNU_REGEX@: to avoid the dependency on libboost_regex, when building on a system that is based on glibc or uClibc
** @-DWT_NO_LAYOUT@: to avoid pulling in Wt's layout managers, if you are not using any WLayout classes
** @-DWT_NO_SPIRIT@: to avoid depending on spirit to parse locale and cookies (if you don't need that)
** @-DWT_NO_XSS_FILTER@: to avoid the extra (runtime) overhead of XSS filtering, usually not relevant for a trusted embedded platform
* Build static libraries (for libwt.a and libwthttp.a)
** in CMake: @SHARED_LIBS:BOOL=OFF@
* Disable build options that you don't need and that introduce extra dependencies (libz, OpenSSL?)
* Further tune your linker command:
** Append @-v@ to the linker command used by CMake to see the raw @collect2@ command line.
** By default, linking shared versus static libraries is all-or-nothing with CMake. However, you probably want to link dynamically against the system-wide versions of libstdc++, libm and libc if other applications on your device use them too.
*** Use @-Bdynamic@ in front of the libraries you wish to link against dynamically
** Some other flags are needed to make sure the linker does not keep unused symbols:
*** Remove @-export-dynamic@
*** Add @--gc-sections@
* Strip your binary using @strip -s@.
* Optionally, when available for your platform, you may want to compress your binary using the "Ultimate Packer for eXecutables (upx)":http://upx.sourceforge.net/. This typically reduces executable size by a further 60-70%, without noticeable run-time performance hits.
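
The Wt build options listed above can be collected into a single CMake invocation. The following is a minimal sketch in Python that only assembles the command line; it does not run CMake. The helper names (@cmake_size_args@, @as_shell_command@) and the source directory @../wt@ are hypothetical, and the flag list should be trimmed to what your application actually allows (for example, drop @-DWT_NO_LAYOUT@ if you use WLayout classes). The linker tuning steps are left out because they depend on the toolchain.

```python
import shlex

# Compiler flags taken from the list above; trim to what your application needs.
SIZE_CXX_FLAGS = [
    "-fvisibility=hidden",
    "-fvisibility-inlines-hidden",
    "-ffunction-sections",
    "-fdata-sections",
    "-DHAVE_GNU_REGEX",  # only on glibc or uClibc based systems
    "-DWT_NO_LAYOUT",    # only if no WLayout classes are used
]


def cmake_size_args(source_dir, extra_cxx_flags=()):
    """Return a cmake argument list for a size-optimized, static Wt build."""
    cxx_flags = " ".join(list(SIZE_CXX_FLAGS) + list(extra_cxx_flags))
    return [
        "cmake",
        "-DCMAKE_BUILD_TYPE=MinSizeRel",
        "-DSHARED_LIBS:BOOL=OFF",
        "-DCMAKE_CXX_FLAGS=" + cxx_flags,
        source_dir,
    ]


def as_shell_command(args):
    """Quote every argument so the result can be pasted into a shell."""
    return " ".join(shlex.quote(a) for a in args)


# Hypothetical usage: configure from a build directory next to the Wt sources.
print(as_shell_command(cmake_size_args("../wt", ["-DWT_NO_XSS_FILTER"])))
```

Keeping the flags in one list makes it easy to compare the resulting binary size with and without each option.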

h3. Measuring performance

To report the run-time performance of Wt on a particular embedded platform, connect to the device over a local area network (through at most one switch), and measure the time between transmission and reception of packets using a packet sniffer. For the measurements, we use two examples that are included in the Wt distribution: "hello":http://www.webtoolkit.eu/wt/examples/hello/hello.wt (as an example of a minimal application), and "composer":http://www.webtoolkit.eu/wt/examples/composer/composer.wt (as an example of a simple, yet functional, application).

We propose to measure the time to create a new session, and the time to process a small event.

h4. Runtime: new session

Wt starts a new session by serving a small page to determine browser capabilities, which then triggers a second request to get the "main page" that has all visible content. To compare the relative performance for a particular platform, measure this "load" time as the total duration of these two requests, that is, the time from sending the first request to sending the third request. The third request is either a GET request for auxiliary content (CSS or images), a GET request to a Wt resource, or a POST request to load invisible content in the background.
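
As an illustration of this definition, the load time can be computed from the send timestamps of the requests captured by the packet sniffer. This is only a sketch: the function name @new_session_load_time@ and the timestamps in the usage line are hypothetical, and extracting the timestamps from the capture is left to your sniffer of choice.

```python
def new_session_load_time(request_times):
    """Return the time from the first to the third request, in seconds.

    request_times: send timestamps (in seconds) of the HTTP requests that
    the sniffer recorded for one new session, in any order.
    """
    ordered = sorted(request_times)
    if len(ordered) < 3:
        raise ValueError("need at least three requests to delimit the load time")
    # The first two requests make up the session start; the third one
    # marks the end of the measured interval.
    return ordered[2] - ordered[0]


# Hypothetical capture: four requests sent during one session start.
load_time = new_session_load_time([10.00, 10.11, 10.26, 10.40])
```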

h4. Runtime: event

We estimate the time needed to process a small event, such as a click on the "Greet me" button in hello or on "Save now" in composer, by measuring the total time for the packet exchange triggered by such an event.

h4. Memory usage: basis

Measuring memory usage is tricky, since code and read-only data memory used by shared libraries is effectively shared between processes, while writable data segments are private to each process.

Therefore, we use @pmap@ to study the memory in the different segments. The basis RAM usage is divided between read-only segments and writable segments. Only the latter are really constrained by physical RAM. We get the total writable size by summing the size of all writable segments, indicated by pmap with a *w*. The total size reported by pmap and top, minus the size of all writable segments, is then the read-only RAM usage. This number includes shared libraries, and thus overestimates actual RAM usage.
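
The split between writable and read-only segments can be automated. The sketch below assumes the plain @pmap <pid>@ output layout of procps (address, size with a @K@ suffix, permissions, mapping name); other pmap implementations may print different columns. The function name and the sample output are hypothetical. Summing the non-writable mappings gives the same number as subtracting the writable total from the overall total.

```python
def split_pmap_sizes(pmap_output):
    """Return (writable_kb, read_only_kb) summed over all mappings."""
    writable = 0
    read_only = 0
    for line in pmap_output.splitlines():
        fields = line.split()
        # Mapping lines look like: <address> <size>K <perms> <name>
        # The header and the "total" line have fewer fields and are skipped.
        if len(fields) < 3 or not fields[1].endswith("K"):
            continue
        if not fields[1][:-1].isdigit():
            continue
        size_kb = int(fields[1][:-1])
        if "w" in fields[2]:
            writable += size_kb
        else:
            read_only += size_kb
    return writable, read_only


# Hypothetical pmap output, for illustration only.
SAMPLE = """\
1234:   ./hello.wt
00008000    1200K r-x--  /root/hello.wt
0013c000      16K rw---  /root/hello.wt
00140000     120K rw---    [ anon ]
40000000      24K r-x--  /lib/ld-uClibc.so.0
bef00000      84K rw---    [ stack ]
 total      1444K
"""

writable_kb, read_only_kb = split_pmap_sizes(SAMPLE)
```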

h4. Memory usage: per session

Compare the memory usage after starting 10 sessions with the basis memory usage, and divide the difference by 10 to estimate the memory used by a single session.
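
The estimate is a simple difference quotient. A minimal sketch, with a hypothetical function name and made-up numbers in the usage line:

```python
def per_session_kb(base_kb, loaded_kb, sessions=10):
    """Estimate the memory used by one session, in KBytes.

    base_kb:   memory usage before any session was started
    loaded_kb: memory usage after `sessions` sessions were started
    """
    if sessions <= 0:
        raise ValueError("sessions must be positive")
    return (loaded_kb - base_kb) / sessions


# Made-up measurement: 200 KBytes at rest, 350 KBytes with 10 sessions open.
estimate = per_session_kb(200, 350)
```

Averaging over several sessions smooths out allocations that happen only once, such as buffers created for the first session.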

h2. Platforms

h3. ARM926EJ-S

h4. Processor features

* Clock speed: 200 MHz
* Linux BogoMIPS: 89.70
* Caches: 8K instruction, 8K data

Configurations are ordered chronologically, latest first.

h4. Config 2: minimal (16/03/2010)

h5. Setup

* *Wt version:* git (16/03/2010, >= Wt 3.1.1)
* *Target system:* Linux 2.6.23, uClibc
* *Build environment:* buildroot, arm-linux-gcc 4.2.1
* *Options:* without multi-threading, libz and OpenSSL
* *Build type:* full static build, except for: libstdc++, libc, and libm
* *Runtime settings:* @./app.wt --docroot . --http-address 0.0.0.0 --no-compression@

h5. Performance results

*Code size and RAM usage (in KBytes)*
|_.Program|_.Code size (strip)|_.Code size (strip + upx)|_.RAM: basis † (read-only)|_.RAM: basis (writable)|_.RAM: per session|
| hello | 1214 | 362 | 2544 | 228 | 14.8 |
| composer | 1462 | 420 | 2796 | 232 | 83.6 |

† includes shared libraries!

*Runtime performance*
|_.Program|_.New session (http)|_.Event (http)|
| hello | 0.26 s | 0.07 s |
| composer | 0.69 s | 0.08 s |
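
As a quick check of the upx figures in the code-size table for this configuration, the relative reduction can be computed from the two code-size columns. The sketch assumes that the stripped sizes read as 1214 and 1462 KBytes (a dot used as thousands separator); the function name is hypothetical.

```python
def upx_reduction_percent(stripped_kb, packed_kb):
    """Return how much smaller the upx-packed binary is, in percent."""
    return 100.0 * (1.0 - packed_kb / stripped_kb)


# Code sizes from the Config 2 table, in KBytes: (strip, strip + upx).
hello_reduction = upx_reduction_percent(1214, 362)
composer_reduction = upx_reduction_percent(1462, 420)
```

Under that reading, both binaries shrink by slightly more than 70%.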

h4. Config 1: minimal (18/03/2008)

h5. Setup

* *Wt version:* CVS-snapshot 18/03/08
* *Target system:* Linux 2.6.23, uClibc
* *Build environment:* buildroot, arm-linux-gcc 4.2.1
* *Options:* with multi-threading, but without libz and OpenSSL
* *Build type:* full static build, except for: libc, libpthread, libdl, libstdc++, and libm
* *Build settings:* @MinSizeRel@, @-DHAVE_GNU_REGEX@
* *Runtime settings:* @./app.wt --docroot . --http-address 0.0.0.0 --threads=2 --no-compression@

h5. Performance results

*Code size and RAM usage (in KBytes)*
|_.Program|_.Code size (strip)|_.Code size (strip + upx)|_.RAM: basis † (read-only)|_.RAM: basis (writable)|_.RAM: per session|
| hello | 1130 | 304 | 2580 | 372 | 28 |
| composer | 1265 | 332 | 2712 | 372 | 126 |

† includes shared libraries!

*Runtime performance*
|_.Program|_.New session (http)|_.Event (http)|
| hello | 0.58 s | 0.15 s |
| composer | 1.8 s | 0.15 s |