
1 Introduction

Since the beginning of the mobile phone’s diffusion, the use of the smartphone as a user interface for controlling surrounding devices and environments has been the topic of many research studies [1], and the possible interactions become particularly interesting when they involve a large number of people at the same time. Since its large-scale adoption, the smartphone has become one of the favorite user interfaces for public events [2–6].

A recent trend is to take advantage of the incredible popularity of mobile devices to give the audience a way to take part in collaborative live events. Mass audience participation in live performances has led to the development of audience participation software frameworks such as massMobile, a flexible client-server system designed to be adapted to various performance needs (and performance venues) by allowing wireless, real-time, bidirectional communication between performers and audiences [7].

Other studies concentrate on using the crowd’s smartphones as user interfaces in the field of collaborative gaming. An example is Space Bugz!, a crowd game for large venues or cinemas that uses an Android app to turn the audience’s smartphones into controllers for the game [8].

2 Proposal

The proposed system aims to involve users in an interactive multimedia experience. For a 360° involvement, participants enter a cylinder made of a tent of LED lights, 12 m in diameter and 6 m high. Inside the cylinder, people are immersed in a rich performance of sound and light stimuli. Participants were grouped six at a time and a latest-generation smartphone was given to each user. Users interacted in real time with the exhibition through a dedicated application on the smartphone, modifying colored lights and percussion sounds. Each phone sent information about its own ID and the user’s actions. The smartphones and the server were connected over a WiFi network. The server linked the incoming information to specific variables in the interactive firmware, and each device was associated with a specific sound and light effect.

Finally, as people danced naturally inside the cylinder, shaking and touching the phones, the visual animation and the music changed continuously.

3 Architecture of the System

The entire system was composed of six Nokia Lumia 920 mobile phones, a WiFi antenna, an Apple Mac G5 server, a sound system (NI Traktor Kontrol S4 MK2 MIDI controller, mixer, amplifier and speakers) and a Barco MiSPHERE tent arranged in a cylindrical shape 12 m in diameter and 6 m high (Fig. 1).

Fig. 1. Architecture of the system

3.1 Server Application

The server is the part of the system in charge of collecting, filtering and processing the data sent by the smartphones and of reflecting the appropriate changes in the video and audio of the performance.

It is coded in the Java programming language and its GUI is composed of two Java Applet windows. One of them is the control interface, which allows the operator to change the running sketch and to check the connected devices (Fig. 2). The other Applet window is sent to the acquisition device and shows the video output of the running sketch.

The sketches (views) are implemented in Processing, a Java-based language and environment built for the electronic arts, new media art and visual design communities; Processing is also available with a standalone IDE [9].

To manage the video settings, the data received from the smartphones are filtered and used to update the appropriate sketch parameters. For example, the number of times per second that device #1 is shaken (an integration of the three-axis acceleration) can affect the size of a related shape or its color gradient. To manage the audio, the server communicates with the MIDI device via MidiBus. When a phone’s data are received, appropriate commands are sent to the MIDI device according to the type of event triggered. For example, a tap on device #1 triggers an event (and a command) that can play/stop an audio sample, play/stop an audio track, edit the track’s tempo or edit the parameters of an audio filter.
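
As a rough illustration of the kind of mapping described above, the sketch below (in Processing, using the MidiBus library) maps a hypothetical per-device shake rate to the diameter of a shape and a tap event to a percussion note. The handler names (onShake, onTap), the MIDI device name, the note numbers and the value ranges are illustrative assumptions and not the actual code of the installation.

import themidibus.*;

MidiBus midi;                      // bridge to the external MIDI device
float[] shakeRate = new float[6];  // shakes per second reported by each phone

void setup() {
  size(800, 600);
  // parent, MIDI input (none), MIDI output; the device name is an assumption
  midi = new MidiBus(this, -1, "Traktor Kontrol S4 MK2");
}

void draw() {
  background(0);
  // Video mapping: the shake rate of device #1 drives the diameter of a circle
  float diameter = map(shakeRate[0], 0, 10, 20, 400);
  fill(0, 255, 0);
  ellipse(width / 2, height / 2, diameter, diameter);
}

// Called by the network layer whenever a shake message from phone `id` is parsed
void onShake(int id, float rate) {
  shakeRate[id] = rate;
}

// Audio mapping: a tap on a device plays a percussion sample on the MIDI device
void onTap(int id) {
  int note = 36 + id;               // one MIDI note per phone (assumption)
  midi.sendNoteOn(0, note, 127);    // channel 0, full velocity
  midi.sendNoteOff(0, note, 0);
}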

The communication with the phones is described in detail later.

3.2 Mobile Application

The application installed on the smartphones is used as an interface to the server that generates sounds and visuals. Participants hold the device and perform several sorts of interaction: they can jump, shake or touch the device.

In order to obtain a rich interaction with the system, the mobile application, and particularly its data collection and analysis, is a key factor. Much research has been conducted in the field of gesture analysis on mobile phones [10–13].

What the app does is:

  1. Collect the user’s gestures and send the interaction data to the server.

  2. Give feedback to the users. This functionality may seem trivial, but it has an important role, since it involves people in a more intimate way: the whole experience captures the attention of the crowd, but the device interacts directly with each person and can thus give feedback directed at the individual rather than at the crowd as a whole.

In order to improve user interaction, some issues have been considered in the design process of the app:

  • The movements used to control the whole system have to be simple: people have to interact naturally with the system without the need for training. Overly complex interactions are to be avoided, because people get frustrated quickly and lose interest in interacting.

  • The movements have to be powerful and expressive: we want to build a system with a deep interaction that can have many shades, so we try to avoid movements so simple that they lead to a boring interaction.

These two principles can conflict with each other; this requires evaluating the trade-off between having poor, inexpressive movements and having movements too complex for the user to understand.

We came to a solution by layering the interactions, using both simple and more complex kinds of interaction and dividing them in such a way that the crowd has a straightforward way to interact with the environment, while being able to discover a deeper interaction once they get bored with the simple one.

  • Layer 1 (simple): touch of the screen. This is the simplest interaction, but it distracts people from the performance. As soon as the user touches the screen, in any position, a touch event is triggered.

  • Layer 1 (simple): shake. This interaction is aimed at detecting a particular kind of movement: a shake occurs when the acceleration and velocity of the device change drastically. At each stationary point of the velocity function (inversion of movement) a shake is triggered.

  • Layer 2 (complex): touch. This interaction is more structured than the simple one: the screen is divided into different sensitive areas and a different command is sent when the user touches a different area of the screen.

  • Layer 2 (complex): tilt. This interaction is raised when the user moves the phone gently while keeping the screen up. It has two parameters: the tilt in both directions, x and y.

  • Side layer: raw data. The phone can also send raw acceleration data. This mode can be disabled when the application is deployed; its main purpose is debugging, but it could also be used to implement new behaviors coded on the server.

The application continuously logs the data generated by the touch screen and the accelerometer, analyzing the incoming data and triggering the messages listed above. For the touch, it is simple to know where and when the user has interacted with the screen using the phone’s API. To distinguish between shake and tilt events, however, we use Boolean membership functions that indicate whether the data belong to the shake set or to the tilt set.

Shake Membership Function. To identify the shake movement we use the concept of magnitude; the magnitude of a three-dimensional vector is defined as:

M = √(x² + y² + z²)

We accept the movement if its magnitude is greater than 5.5; this value was determined from observation and testing of a person shaking the phone. Below we report a chart showing the function.

Figure 3 reports a sample movement divided along the three axes, together with the computed magnitude and the corresponding acceptance output.

Fig. 2. Server graphical interface

In Fig. 4 it is possible to see the acceptance of the function with respect to the derivatives of x, y and z; the data are normalized to 1. We can clearly see the identification in action: when a movement with sufficient magnitude is recorded on the axes, the membership function recognizes the concentrated energy of the movement.

Fig. 3. Sample movement divided along the three axes

Fig. 4. Sample movement integrated

Tilting Membership Function. To detect the tilt action we use the derivative of Z: when the derivative is small enough and the Z component is large enough in module, these two conditions imply that the user is holding the phone with the screen up and is not shaking it or making other movements, but tilting it.
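
To make the two membership functions concrete, the minimal Java sketch below implements the shake test as a threshold on the magnitude of the change in acceleration between consecutive samples and the tilt test as a bound on the Z derivative combined with a large Z component. Java is used for consistency with the server code (the actual app ran on Windows Phone); the 5.5 threshold comes from the text above, while the other thresholds, the class and method names, and the choice of applying the magnitude to the per-sample acceleration change are assumptions made for illustration.

public class GestureClassifier {

    private static final double SHAKE_MAGNITUDE_THRESHOLD = 5.5; // from observation and testing
    private static final double TILT_MAX_Z_DERIVATIVE = 0.5;     // assumed: "derivative small enough"
    private static final double TILT_MIN_Z = 8.0;                // assumed: screen up, Z close to gravity

    private double prevX, prevY, prevZ;
    private boolean hasPrevious = false;

    // Magnitude of a three-component vector: M = sqrt(x^2 + y^2 + z^2)
    private static double magnitude(double x, double y, double z) {
        return Math.sqrt(x * x + y * y + z * z);
    }

    // Shake membership: accept the movement if the magnitude of the acceleration change exceeds the threshold
    public boolean isShake(double x, double y, double z) {
        return hasPrevious
                && magnitude(x - prevX, y - prevY, z - prevZ) > SHAKE_MAGNITUDE_THRESHOLD;
    }

    // Tilt membership: Z derivative small enough and Z large enough in module (phone held screen up, moved gently)
    public boolean isTilt(double x, double y, double z) {
        return hasPrevious
                && Math.abs(z - prevZ) < TILT_MAX_Z_DERIVATIVE
                && Math.abs(z) > TILT_MIN_Z;
    }

    // Feed one accelerometer sample; returns the classified gesture, or null if none
    public String classify(double x, double y, double z) {
        String gesture = null;
        if (isShake(x, y, z)) {
            gesture = "shake";
        } else if (isTilt(x, y, z)) {
            gesture = "tilt";
        }
        prevX = x; prevY = y; prevZ = z;
        hasPrevious = true;
        return gesture;
    }

    public static void main(String[] args) {
        GestureClassifier c = new GestureClassifier();
        c.classify(0.1, 0.2, 9.8);                       // phone at rest, screen up
        System.out.println(c.classify(0.2, 0.3, 9.7));   // gentle movement  -> "tilt"
        System.out.println(c.classify(4.0, -3.5, 6.0));  // sudden inversion -> "shake"
    }
}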

3.3 Communication and Interaction Types

Whenever a user gesture takes place (all the interactions are gestures), the device sends a message to the server through a UDP datagram. The UDP protocol was chosen because of its low communication overhead, which helps keep the communication latency low, a key factor for a real-time application.

Different datagram structures were used depending on the interaction. Each message is composed of the device identification number, the type of the interaction and the data of the interaction.

The intercepted interactions are:

  • touch: when the user presses or releases an area of the smartphone’s screen. The message also specifies the number of fingers used and the coordinates of the touched area.

  • tilt: when the user gently tilts the phone. The message also specifies the tilt direction.

  • shake: when the user, while moving the phone in one direction, performs a fast inversion of the movement. The message also specifies the shake direction.
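
As a sketch of the client side of this exchange, the Java snippet below sends one interaction message as a UDP datagram carrying the device identification number, the interaction type and the interaction data. The textual encoding, the field separator and the server address and port are assumptions made for illustration; the actual wire format of the installation is not specified here.

import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

public class InteractionSender {

    private static final String SERVER_HOST = "192.168.1.10"; // assumed server address
    private static final int SERVER_PORT = 9000;              // assumed port

    // Sends one message composed of device ID, interaction type and interaction data
    public static void send(int deviceId, String type, String data) throws Exception {
        // e.g. "1;touch;fingers=1,x=0.42,y=0.77" or "1;shake;dir=x"
        byte[] payload = (deviceId + ";" + type + ";" + data)
                .getBytes(StandardCharsets.UTF_8);
        DatagramSocket socket = new DatagramSocket();
        DatagramPacket packet = new DatagramPacket(
                payload, payload.length,
                InetAddress.getByName(SERVER_HOST), SERVER_PORT);
        socket.send(packet);   // fire-and-forget: no handshake, keeps latency low
        socket.close();
    }

    public static void main(String[] args) throws Exception {
        send(1, "touch", "fingers=1,x=0.42,y=0.77");
        send(1, "tilt", "x=0.10,y=-0.30");
        send(1, "shake", "dir=x");
    }
}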

3.4 Graphical User Interface

We chose to keep the UI very simple, because the user must not be distracted from the interaction by the phone. Below you can see a mockup of the final application.

We designed the application with a big colorful area that identifies the smartphone, plus an identification number. In the middle of the screen a cursor is displayed to help the user interact with the app. At start-up the screen is black and the cursor is static, but when the connection with the server is established the cursor starts to pulse, showing that the app is now live (Fig. 5).

Fig. 5. Graphical user interface in a range of colors

The graphic elements of this minimal app have a fundamental role: they guide users and help them identify which of the interactions listed before they are performing.

  • Touch: when the user touches the screen, the cursor moves under the user’s finger, highlighting the touched area.

  • Tilting: when the user starts a tilt gesture, the screen of the phone changes color, going for example from black to a vivid green (depending on the color chosen for the device), and the pulsing cursor starts to move like a steel ball balancing on the screen, indicating to the user the exact value being sent to the server.

  • Shaking: when the user starts a shake gesture, the screen flashes and the phone vibrates. We chose to use a haptic feedback because, when shaking the phone, the user is most likely not looking at the screen.

3.5 User Experience

On the night of the performance, hostesses at the event entrance gave out the controller smartphones to visitors interested in interacting with the exhibition. Each device came preloaded with Pyck Chroma, the Windows Phone application developed to transform the stand’s smartphones into an interface able to interact with the graphical and musical exhibition taking place at the location.

Up to six visitors at a time could control the way the light and music changed: by shaking and tilting the devices or by tapping on the screens of the stand’s smartphones, the users produced a variation in the generative graphics displayed on the MiSPHERE screen, and a particular, recognizable sound was emitted and added to the playing music base (Fig. 6).

Fig. 6. Generative graphic examples

Each user interaction with the device was translated into a visual or audio output. Every smartphone had an assigned effect on the exhibition and affected the show in a unique and unambiguous way, for example by emitting a univocally assigned percussion sound or by altering only the graphical elements of a specific color assigned to that phone; every user could thus easily recognize how his or her actions were controlling the light and sound exhibition. Different graphical themes were developed, each with its own coordinated set of sound effects and each designed to work in a similar and intuitive way from the user’s point of view. The participants were left with the feeling of being a DJ at a collaborative interactive happening.

The exhibition took place as part of the Milano Design Week (the most relevant design-related event in Milan beside the Salone Del Mobile) from 8 to 13 April 2014. The location was in the festival zone designated as “Tortona Around Design”, named after the road it covers, via Tortona.

A cylinder 12 m in diameter and 6 m high, made of a tent of lightbulbs (Barco MiSPHERE), was erected on a roundabout. Every night, from sunset until 11.00 PM, for an entire week, people got the chance to interact with the system and try to control the atmosphere of the exhibition (Fig. 7).

Fig. 7. Users interacting with the system during user tests and general set-up

Since the event took place during a festival that as a whole is estimated to have attracted more than 300,000 people, there was a high affluence of visitors around the installation. A significant number of people thus participated in the collaborative happening, with an even greater audience watching them.

Each participant was given a smartphone by a fair assistant, who quickly explained and demonstrated how to interact with the device and use it as a controller. “Teams” of participants interacted with the system for a couple of minutes at a time.

Up to six different graphic and sound themes were used during the week. Some refinements to the themes were iteratively made with the purpose of achieving a better interaction quality.

4 Conclusion

In conclusion, smartphones can be regarded as an attractive and functional interface for designing engaging and interactive social experiences based on audience participation. Some considerations are however in order, to acknowledge the limitations and problems we encountered in our experimentation.

We ran into some technical difficulties related to the design of the smartphones themselves: the lock buttons were located in a position where they kept being bumped and pressed unintentionally by the users, interrupting the signal from the devices to the server and severing the connection. Users were left feeling confused, and this affected their involvement in and comprehension of the interaction.

Participants sometimes had noticeable difficulty in associating themselves and their actions with the corresponding auditory and visual feedback of the system. This was due to a number of factors:

  1. The screen was really big, since it had to be visible from the whole street, and it was meant to be seen from the outside and from a considerable distance. The users instead found themselves at the center of the circle and saw the lightbulb screen from the inside out. Both the shape of the screen and the excessive nearness to it left the users disoriented and unable to fully take in the effect of their actions.

  2. Even considering its large dimensions, the tent resolution was really low (about 360 × 60 pixels/lightbulbs) and it did not allow more elaborate shapes to be displayed. The graphics actually used had to be simple enough to be displayed accurately by the tent while at the same time remaining recognizable and entertaining for the user.

  3. On the audio front, since the communication is not synchronous, the instant at which the gesture is made on the device is not the same as the one at which the server processes the command sent by the phone and outputs the sound from the mixer. The two instants are not close enough for a perception of true synchronicity between the interaction and the emission of sound, and the delay between the user’s action and the system’s reaction remains noticeable and distracting.

Even considering these limitations, the event was a success: people genuinely enjoyed interacting with the system and were left feeling like the main actors of a collaborative interactive show. Both kids and adults of all ages enjoyed playing with the smartphones in such a novel way and had little to no trouble understanding how the interaction worked. All the participants were impressed by the musical and visual effects produced by their actions.