Crowdsourcing as a Social Interaction Tool to Stimulate Sustainable Transportation Mode Use

The empirical data were used to validate mode shift behaviors for 77 participants from California State University, Long Beach. Data collection spanned two phases, Phase I followed by Phase II, each lasting one month. Participants used one of four modes (personal car, walking, bicycling, or public transit) to arrive at the university campus. During Phase I, a control group was created, and the individual mode choice of each participant was obtained. In Phase II, each participant was assigned a short, distinct encrypted name and was asked to post a daily comment on the quality of experience with the mode used to arrive at the campus. The comments were posted on a Twitter page that served as the crowdsourcing platform for this study. The encrypted name masked the identity of the user. Analysis at the end of Phase II showed an overall mode shift of almost 19% of personal car users to the more sustainable modes of walking, bicycling, and transit.


INTRODUCTION
Crowdsourcing is emerging as a significant social interaction tool for solving problems that are traditionally expensive to solve individually [1,2]. Crowdsourcing refers to the technique of gathering opinions and information from the crowd [3]. When used effectively, it can draw on the public's intelligence and skills to solve complex issues [4]. The collection of information through crowdsourcing is often facilitated by […] outcomes are limited to providing insights at the individual level, which may not be uniform across all potential transport users. Although the information obtained through surveys is very useful for understanding transportation problems that exist locally or within the area surveyed, the policy implications and impacts can only be empirically justified. Thus, a suitable choice modeling technique can be deployed to understand the determinants that govern complex human mode choice among a set of available options and variables. To simplify the analysis, most of the research in this field can be subdivided into qualitative and quantitative studies. An in-depth understanding of both the qualitative and quantitative aspects of people's perceptions, attitudes, and behaviors toward choosing a private car over public transport helps make policy decisions justifiable. While qualitative methods directly allow the assessment and explanation of an individual transport user's behavior and attitudes, quantitative methods draw implications and attitudes from a traveler's mode choice based on statistical data analyses [7].
Typical applications of crowdsourcing let individuals express their thoughts and concerns on a topic over a common platform, so that a diverse group of people contributes different ideas in the hope of solving one or more specific problems [8]. Fig. 1 illustrates how basic crowdsourcing is theorized to work, along with the main steps typically involved [6]. This way of seeking a solution widens the range of possible solutions considered and increases public participation in a project. It also assists urban planners, who often have difficulty obtaining public involvement [9]. Furthermore, crowdsourcing can supply the information needed for planning decisions [10].
Crowdsourcing systems are divided into three types based on whether participation requires expertise and on time and location (Fig. 2). Participant expertise is important because it elicits responses from actual transport mode users, while general participation is useful when expertise is not an issue; even then, general participation by non-experts gives an informative picture of the utility of a system. A crowdsourcing system can be further classified by whether the information is sought at the same place or the participants are at different places. Audience-centric systems (audience-played games) and geocentric systems (route choice data) are same-place sub-systems, while event-centric (event-based) and global systems (such as Wikipedia) are different-place sub-systems. Cities tend to be well suited to crowdsourcing, since useful digital tools and individuals willing to share data are readily available [10]. Smartphone usage has increased to such an extent that the basic applications available on a typical smartphone substantially increase the opportunity to obtain beneficial data and fill previous data gaps [1]. Data that could be obtained through crowdsourcing include travel behaviors of users, the current physical environment of the transportation system, improvement opportunities, public perceptions of new infrastructure projects, bicycle and foot trips, transportation demands within cities, cycling safety and routes, and contextual geographic information about current events in social media [6, 10, 14-17]. Other possible applications include smart parking, ridership data, transit troubleshooting, road condition monitoring and assessment, urban traffic planning and management, and many other issues involving big data [10,15,18].
To fully understand mode shifts, it is also important to consider the behavioral aspects of users in mode choice decisions [19,20]. These act in conjunction with a variety of other explanatory variables, which constitute the psychological, socioeconomic, and demographic factors in modeling [21,22]. Including psychological factors pertaining to human behavior makes mode-choice modeling complex [23]; they are included primarily to improve the structure of the utility function, which improves the goodness of fit of mode-choice analyses [23,24]. Proxies or dummy variables have often been used to include behavioral aspects of individuals involved in mode choice decisions [25]. Osman Idris et al. [23] addressed this behavioral complexity by employing a "multivariate statistical modeling approach to investigate the causal relationships between the underlying psychological aspects affecting mode choices such as habit, attitude and affective factors" and then consolidating the approach with the Theory of Interpersonal Behavior by Triandis [26]. The determinants that often cause a mode switch between private cars and public transport depend on the mode available, such as light rail [27], walking/bicycling [28], and bus [29]. In summary, extensive studies have been carried out to identify the factors that cause an individual's preference for a private car over public transportation (and vice versa).
In this research, a mathematical model is developed using crowdsourcing as a social interaction tool to assess the use of sustainable transportation modes such as walking, bicycling, and public transit. The popularity of a mode is first modeled by gathering and evaluating opinions received from participants on a series of transportation modes. Opinions can be obtained in various ways, for example through traditional blogs or social media platforms (such as Twitter and Facebook). Both methods provide an open mechanism for learning the opinions of others about a mode before presenting one's own views on a mode choice. This enables others to gauge the most popular and commonly used mode to travel to a destination. Thus, a participant can access comments and opinions posted by others on the platform, arranged in chronological order. In a way, the process amounts to assisting an individual's decision-making based on the collective intelligence of the crowd [30].
Opinions shared socially on a common platform reflect one's preference from among the series of modes. In this process, a person's opinion may be influenced both by that person's prior experience or perception of the mode and by the opinions of others. This illustrates how decisions to favor or not favor a mode are made using crowdsourcing in a social interaction setting.

MATERIALS AND METHODS
An analytical framework is developed for evaluating the popularity level of a mode. The popularity is based on positive or negative posts about the mode on a Twitter platform. Further, mode choice behavioral impacts are observed on the participating individuals who read those posts. The following notations have been used for the mathematical formulation described below. A perception score is developed in this research based on the group capacity and player fitness theory that Guazzini et al. [31] adopted for solving an increasingly challenging task. In the theory proposed by Guazzini et al., an integer parameter called the capacity is introduced for a given group and is incremented by an integer each time a task is solved. The basis is the incremental nature of human advances, for which there is evidence of superlinear behavior through the chain of fitness gains. In this research, two mechanisms underlie the superlinear behavior: the individual scheme, related to the skill of providing positive or negative comments on the performance of a transportation mode in a given time, and the accumulated knowledge developed from all the comments over the history.
In developing the formulation for the perception score, the impact on the individual scheme and the capacity developed from the $Y_m$ target group is modeled for the comments from all $N$ individuals. The mode choice of individuals from the $X_m$ group is assumed to be rigid, irrespective of the positive or negative opinions and comments posted by the participants.
Consider the incremental increase in the capacity of an individual, modeled from an initial time interval $t = 0$ with the interval index increasing in integer steps, $t = 1, 2, 3, \ldots$. Within each time interval, a new set of positive and negative comments about a mode is posted on the Twitter page by all $N$ individuals and is eventually read by all participating individuals. Each time interval is assumed to be large enough to accommodate comments from all $N$ individuals. The comments are assumed to affect the ridership and usage of mode $m$ among the $N$ individuals. A high number of positive comments for a mode would indicate higher usage and popularity of the mode. After a sufficient number of positive and negative comments have been received, no more changes in mode choice or usage of a mode occur. This means that a perception is developed among all $N$ individuals after some initial corrections or adjustments in their comments with respect to their mode choice.
In this paper, the individuals from the $Y_m$ group are treated as the focus group, as it is assumed that any mode shift will occur within this group, in a social set-up influenced by posts about mode use experience crowdsourced over the Twitter page. Therefore, the goal is to evaluate whether any individual from $Y_m$ would switch to mode $m$ because of the positive posts on the page. Alternatively, individuals from $Y_m$ might discontinue using mode $m$ after being influenced by negative posts from any of their peers in the $Y_m$ group.
If the positive perception of mode $m$ exceeds the negative perception, that is, $f_{Y_m,p} > f_{Y_m,n}$, then over a period of time the number of $Y_m$ individuals using mode $m$ will increase.
Within a crowdsourcing set-up, a certain group of individuals notice the social behavior of their peers, are influenced by the growing popularity of an object, commodity, etc., and thus, with time, begin to align their perceptions, opinions, and actions with the majority of the crowd [6,32]. This basic idea is exploited in this research to build a framework that allocates quantitative scores to an individual's choice of a mode every time the individual from $X_m$ (or $Y_m$) sees a positive (or negative) post about the mode. A positive post consolidates the faith of individuals who are already in favor of mode $m$, while a negative post may partially erode the favorable opinions that individuals might have of mode $m$.
The first individual from the $Y_m$ group develops an intelligence capacity, or perception score, equal to $\alpha f_{Y_m,p}$, which accounts for the individual's positive perception of mode $m$. The perception could be developed before or after the posts on the Twitter page are read. Correspondingly, the quantitative perception $(1-\alpha) f_{Y_m,n}$ accounts for the same individual's perception of the negative comments. The second individual from the $Y_m$ group builds quantitative perceptions equal to $\alpha (f_{Y_m,p} + 1)$ and $(1-\alpha)(f_{Y_m,n} + 1)$ for positive and negative posts, respectively, because the post from the first individual has influenced the perception of the second. Similarly, the $n$th individual from $Y_m$ builds a positive perception score of $\alpha \{f_{Y_m,p} + (n-1)\}$ for mode $m$ from the positive posts, and a score of $(1-\alpha)\{f_{Y_m,n} + (n-1)\}$ is assigned against mode $m$ for the negative posts.
Therefore, the perception score $S_{m,n}$ developed by an individual from $Y_m$ who is at the $n$th observing position of posts for mode $m$ is expressed as: […] (1)

The parameters $f_{Y_m,p}$ and $f_{Y_m,n}$, which are the quantified perceptions of mode $m$ held by individuals in the $Y_m$ group, are assumed to be constant once equilibrium is reached and no more mode shift occurs irrespective of positive or negative posts from the $N$ individuals. At the end of each time period, the total score (termed the crowd-based perception score, $\Phi_m$) is obtained for mode $m$. $\Phi_m$ is a simple summation of $S_{m,n}$ over $n = 1, 2, \ldots, Y_m$, that is, across all the individuals in the group. The value of $\Phi_m$ serves as a proxy for the overall popularity of mode $m$.
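The per-observer components described above can be turned into a small numerical sketch. It assumes that the positive and negative components of the nth observer are combined as a difference; the text states only that the negative component is assigned against the mode, so the subtraction is an assumption rather than a confirmed form of Eq. (1). The function names and the sample inputs are hypothetical.

```python
def perception_score(n, alpha, f_pos, f_neg):
    """Assumed score S_{m,n} of the n-th observer (n = 1, 2, ...).

    Positive part: alpha * (f_pos + (n - 1))
    Negative part: (1 - alpha) * (f_neg + (n - 1)), counted against the mode.
    Combining the two parts by subtraction is an assumption.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    positive = alpha * (f_pos + (n - 1))
    negative = (1.0 - alpha) * (f_neg + (n - 1))
    return positive - negative


def crowd_perception_score(group_size, alpha, f_pos, f_neg):
    """Phi_m: sum of S_{m,n} over n = 1..group_size (the Y_m individuals)."""
    return sum(
        perception_score(n, alpha, f_pos, f_neg)
        for n in range(1, group_size + 1)
    )


# Hypothetical inputs, chosen only to exercise the functions.
phi = crowd_perception_score(group_size=3, alpha=0.5, f_pos=6.0, f_neg=4.0)
```

Under this assumed form, each additional observer adds (2α - 1) to the previous observer's score, so the running total grows with group size when α exceeds 0.5 and shrinks when it falls below.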

RESULTS
The application of the perception score is illustrated using real-life data collection and analysis. The goal of the application example is to understand at what point a mode can potentially become popular among individuals. The empirical exercise assesses any mode shift observed among four transportation modes (car, transit bus, bicycling, and walking) for 77 students from California State University, Long Beach (CSULB). These four modes are commonly used by CSULB students to arrive at the campus. Data collection spanned a two-month period divided into two phases of one month each, Phase I and Phase II. Only participants who could easily choose more than one mode to arrive at the campus were included in the study. Furthermore, participating students were required to be on campus at least once a week from Monday to Friday, to own or have access to a smartphone, and to be able to use a personal car, public transport, walking, or biking to arrive at the campus.
The weather throughout the data collection effort was favorable for using all four modes of transport, so it did not influence the mode choice of the participants. Phase I was carried out in October 2018 and Phase II during November and December 2018. For both phases, data were collected only for the participants' weekday travel to the CSULB campus. During Phase I, the mode choice of each participant was collected every Tuesday, Thursday, and Friday by email and was not shared among the participants.
A random sample of students was selected from the College of Engineering at CSULB. To make the sample as random as possible, no two participants belonged to the same level of study (freshman, sophomore, junior, or senior). In addition, it was ensured that the usual campus arrival times of participants belonging to the same major differed by at least two to three hours on a given weekday. Approximately equal numbers of male and female students participated in this study.
During Phase II, each participant was provided with a short encrypted name (a random four-letter first name and a random four-letter last name) to be used when posting tweets on a Twitter account created and managed specifically for this research. This was done to mask the participants' identities from one another and protect their privacy when posting on the Twitter account. All unrelated, unwanted, or differing posts, other than the ten phrases listed below, were deleted from the Twitter page as soon as the researchers found them. A warning was also issued to the violator by email against any frivolous or unrelated postings other than those ten phrases. The researchers served as the administrators and owners of the Twitter account. The students were asked to post short phrases on the controlled Twitter account about the mode used to reach the campus on weekdays (Monday-Friday), along with a score rating the mode. The students were provided with the following ten phrases:

1. "Heavy traffic to campus" - and any similar phrase that would indicate that the traffic is not conducive to driving
2. "Light traffic to campus" - and any similar phrase that would indicate that the traffic is conducive to driving
3. "Difficulty in finding parking" - and any similar phrase that would indicate that the parking lots on campus are full
4. "Found parking" - and any similar phrase that would indicate that parking spaces were available on campus
5. "Enjoyed bus ride" - and any similar phrase that would indicate easy access to transit bus to campus
6. "Bus ride was rough" - and any similar phrase that would indicate difficult access to transit bus to campus
7. "Enjoyed biking" - and any similar phrase that would indicate that biking to the campus was a great experience
8. "Biking was rough" - and any similar phrase that would indicate that biking to the campus was not a good experience
9. "Enjoyed walking" - and any similar phrase that would indicate that walking to campus was a good experience
10. "Walking woes" - and any similar phrase that would indicate that walking to campus was not a good experience

For the above ten phrases, positive and negative interpretations are tabulated in Table 1. Each phrase had at least one positive interpretation.
The ten phrases were selected so that they also covered parking problems for participants driving to school, as expressed by phrase 4, "Found parking". The phrases conveyed a participant's positive and negative experiences with a mode during Phase II. It was anticipated that this real-time information about a mode would then affect the mode choice of other participants.
Phase II data were collected mostly during November, with the first week of December added to offset the one-week break for the Thanksgiving holiday.
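One way to tally posts of this kind is sketched below. The polarity assigned to each phrase is read from the wording of the phrase with respect to the mode it names; that reading is an assumption and is not taken from Table 1. The mode labels, the function name, and the sample posts are hypothetical, and matching is exact, which is a simplification because participants could also post similar phrases.

```python
from collections import Counter

# Polarity of each prompt with respect to the mode it names.
# Assumption: inferred from the phrase wording, not copied from Table 1.
PHRASES = {
    "heavy traffic to campus": ("car", "negative"),
    "light traffic to campus": ("car", "positive"),
    "difficulty in finding parking": ("car", "negative"),
    "found parking": ("car", "positive"),
    "enjoyed bus ride": ("bus", "positive"),
    "bus ride was rough": ("bus", "negative"),
    "enjoyed biking": ("bicycle", "positive"),
    "biking was rough": ("bicycle", "negative"),
    "enjoyed walking": ("walk", "positive"),
    "walking woes": ("walk", "negative"),
}


def tally_posts(posts):
    """Count posts per (mode, polarity); unrecognized text is skipped."""
    counts = Counter()
    for post in posts:
        key = post.strip().lower()
        if key in PHRASES:
            counts[PHRASES[key]] += 1
    return counts


# Hypothetical posts for illustration only.
sample = ["Found parking", "Heavy traffic to campus", "Enjoyed biking", "lol"]
tally = tally_posts(sample)
```

Skipping unrecognized text mirrors the moderation step in which posts outside the ten phrases were removed from the page.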
Before the commencement of Phase I, participants were required to provide their positive and negative perception values for each day of the week by email. The positive perception is denoted by $f_{Y_m,p}$ and the negative perception by $f_{Y_m,n}$. Each participant was asked to provide a score between 0 (lowest) and 10 (highest) as a means of estimating $f_{Y_m,p}$ and $f_{Y_m,n}$. The average values of $f_{Y_m,p}$ and $f_{Y_m,n}$ at the beginning of Phase I are compiled in Table 2. For two modes, car and walking, the positive perception of using the mode was higher than the negative perception.
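The averaging step can be sketched as follows. This is a minimal illustration that assumes each participant contributes one (positive, negative) pair of ratings per mode; the function name and the sample ratings are hypothetical and are not values from Table 2.

```python
from statistics import mean


def average_perception(ratings):
    """Average 0-10 ratings into (f_pos, f_neg) for one mode.

    `ratings` is a non-empty list of (positive, negative) pairs,
    one pair per participant.
    """
    for pos, neg in ratings:
        if not (0 <= pos <= 10 and 0 <= neg <= 10):
            raise ValueError("ratings must lie between 0 and 10")
    f_pos = mean(pos for pos, _ in ratings)
    f_neg = mean(neg for _, neg in ratings)
    return f_pos, f_neg


# Hypothetical ratings from three participants for a single mode.
f_pos, f_neg = average_perception([(8, 3), (6, 5), (7, 4)])
```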

Empirical Results
The mode choice of participants was obtained under Phase I and Phase II and compiled for the duration of each phase, as shown in Table 3. The Phase I data in Table 3 show the mode split of the participants without any influence of crowdsourcing or outside knowledge of the mode choice of other participants. Car had the highest percentage of use, while the bus was the second most used mode of transportation to arrive at the CSULB campus during Phase I.

The chart in Fig. 3 shows the variation of the crowd-based perception score, $\Phi_m$, versus the percentage of users for the four modes. As the value of $\Phi_m$ increases, the number of users of the mode also increases. Car has the largest value of $\Phi_m$ and the highest percentage of users, while bicycling has the smallest $\Phi_m$ and the lowest percentage of users. Thus, this empirical analysis serves as a validation of the crowd-based perception score developed in this research. City planners and stakeholders can improve the ridership and user frequencies of sustainable transportation modes (such as transit and bicycling) by using crowdsourcing to solicit transparent public opinions on specific infrastructure facilities. Furthermore, the crowd-based perception score developed in this research can be used to assess the future popularity of the modes.
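How a mode split and a car-user shift percentage of this kind could be computed from per-participant records is sketched below. The function names, participant identifiers, and mode labels are hypothetical, and the sample records are made up to exercise the code. The definition of shift used here (the share of Phase I car users who report a non-car mode in Phase II) is an assumption about how such a figure could be derived, not the study's documented procedure.

```python
from collections import Counter


def mode_split(choices):
    """Return {mode: percentage of participants} for a list of mode labels."""
    counts = Counter(choices)
    total = len(choices)
    return {mode: 100.0 * count / total for mode, count in counts.items()}


def car_shift_percent(phase1, phase2):
    """Percentage of Phase I car users who use another mode in Phase II.

    `phase1` and `phase2` map a participant id to a mode label.
    Participants missing from `phase2` are not counted as switched.
    """
    car_users = [pid for pid, mode in phase1.items() if mode == "car"]
    if not car_users:
        return 0.0
    switched = sum(
        1 for pid in car_users if pid in phase2 and phase2[pid] != "car"
    )
    return 100.0 * switched / len(car_users)


# Hypothetical records for five participants.
phase1 = {"a": "car", "b": "car", "c": "car", "d": "car", "e": "bus"}
phase2 = {"a": "bus", "b": "car", "c": "car", "d": "car", "e": "bus"}
split = mode_split(list(phase1.values()))
shift = car_shift_percent(phase1, phase2)
```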

DISCUSSION
The crowd-based perception score $\Phi_m$ can be used to make decisions regarding the popularity of a mode. The higher the value of $\Phi_m$, the higher the popularity of mode $m$ among $Y_m$ individuals. With a varying value of $\alpha$, a closed-form expression for the optimal $\Phi_m$ can be obtained using the first and second derivatives of $\Phi_m$.
The first derivative of $\Phi_m$ with respect to $\alpha$ is expressed as […]

Fig. (3). Variation of percentage mode users with crowd perception score for Phase II.

Although two roots are possible for the quadratic expression resulting from the first derivative, the other root does not maximize the perception score, as shown in the Appendix. The second derivative of $\Phi_m$ with respect to $\alpha$ gives […]

As evident from Eq. (4), in order to maximize the crowd-based perception score $\Phi_m$ for a given mode $m$, it is observed that $0 \le \alpha \le 1$ and that the second derivative of $\Phi_m$ with respect to $\alpha$ is negative. This gives useful information for understanding the impact of $\alpha$ on popularizing a mode through crowdsourcing. Table 4 summarizes the conditions on $\alpha$ that maximize $\Phi_m$ for a given mode $m$; detailed derivations are shown in the Appendix. Note that the above optimization exercise can work only with a set of individuals from $Y_m$ who tend to follow the crowd in decision-making. In Table 4, the variables $f_{Y_m,p}$, $f_{Y_m,n}$, and $\alpha$ are considered components of crowdsourcing. Approximate values of […] and […] can be determined using survey findings before individual scores from $X_m$ and $Y_m$ are sought on the perceptions of mode $m$. Perceptions about the use of a mode can be negative or positive. Every individual in $X_m$ provides an initial perception rating for a mode from 0 to $N$; the higher the rating, the more inclined the individual is to use the mode. The average of those ratings across all $X_m$ (or $Y_m$) individuals gives the average range $\Phi_m$ for $f_{Y_m,p}$ (or $f_{Y_m,n}$), which is $(0, N)$. The analytical model developed in this paper can be used to determine whether there is future potential for enhancing the popularity of sustainable transportation modes such as transit and bicycling.

Sensitivity Analysis with Crowdsourcing Components
An example illustration is presented in the chart (Fig. 4) to show the impact on the crowd-based perception score $\Phi_m$ of a mode $m$ for assumed values of $\alpha$ varying from 0 to 1 at an interval of 0.02. The chart is for $N = 100$, with assumed values of the parameters $f_{Y_m,p}$ and $f_{Y_m,n}$ as shown in Table 2. The values of $f_{X_m}$ show a decreasing trend in the perception of using mode $m$ (and the values of […] show an increasing trend toward the perception of not using the same mode $m$) across all four scenarios shown in Table 5. Scenarios A and B show that $Y_m$ individuals have a much higher negative perception and a low positive perception of using the mode, which indicates that the perception scores will be quite low. Scenarios C and D show that $Y_m$ individuals have a much higher positive perception and a low negative perception of using the mode, which indicates that the perception scores will be quite high for these two scenarios.

Fig. (4). Variation of the perception score versus $\alpha$.

The observations from the chart in Fig. 4 show the potential for enhancing the perception score of mode $m$ among $Y_m$ individuals. The perception score is used as a proxy for the popularity of, attitude toward, or perception of mode $m$; the chart indicates that at $\alpha$ = 0.89, 0.63, 0.38, and 0.26, the negative perception of using the mode changes to a positive one for Scenarios A, B, C, and D, respectively. In Scenarios A and B, the negative perception of using mode $m$ (i.e., $f_{Y_m,n}$) is at its maximum; Scenario B attains a positive perception score at $\alpha = 0.63$ (with lower $X_m$, higher $Y_m$) before Scenario A, which attains a positive perception score at $\alpha = 0.89$ (with higher $X_m$, lower $Y_m$). This is because the positive perception of using mode $m$ (i.e., $f_{Y_m,p}$) is higher for Scenario B than for Scenario A. This is expected, since a higher positive perception of a mode will accelerate its acceptance among individuals who have an overall negative perception.
The positive perception ($f_{Y_m,p}$) of using mode $m$ is the maximum possible for both Scenarios C and D, while the negative perception ($f_{Y_m,n}$) is lower for Scenario D than for Scenario C. Thus, Scenario D reaches a positive value of the perception score $\Phi_m$ at the lower value of $\alpha = 0.26$ (with lower $X_m$, higher $Y_m$), before Scenario C, which reaches a positive value at $\alpha = 0.38$ (with higher $X_m$, lower $Y_m$). The expected score, signifying the potential for acceptance and use of mode $m$, is higher for Scenario D than for Scenario C at all values of $\alpha$.
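The mechanics of such a sweep over α can be sketched as follows. The sketch uses a simplified score in which each observer's positive and negative components are combined as a difference and summed over a fixed group size; that form, the function names, and the input values are assumptions made for illustration. Because the full expression behind Fig. 4 is not reproduced here, the sketch is not expected to return the crossover values reported for Scenarios A to D.

```python
def crowd_score(alpha, f_pos, f_neg, group_size):
    """Assumed Phi_m: sum over n of
    alpha * (f_pos + n - 1) - (1 - alpha) * (f_neg + n - 1)."""
    return sum(
        alpha * (f_pos + n - 1) - (1.0 - alpha) * (f_neg + n - 1)
        for n in range(1, group_size + 1)
    )


def first_positive_alpha(f_pos, f_neg, group_size, step=0.02):
    """Return the smallest grid value of alpha in [0, 1] whose score is
    positive, or None if the score never becomes positive on the grid."""
    steps = round(1.0 / step)
    for i in range(steps + 1):
        alpha = i * step
        if crowd_score(alpha, f_pos, f_neg, group_size) > 0:
            return round(alpha, 2)
    return None


# Hypothetical perception values, not taken from Table 5.
crossover = first_positive_alpha(f_pos=2.0, f_neg=8.0, group_size=100)
```

Scanning a fixed grid keeps the sketch aligned with the 0.02 interval used for the chart; a root finder would locate the crossover more precisely but would not match the plotted points.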
The sensitivity analysis can be useful for transit agencies and managers in understanding how transit is perceived among potential users who do not use transit at all. Thus, the crowdsourcing methodology illustrated above can be used to assess the level of popularity of transit that can be achieved by increasing its ridership.

CONCLUSION
Crowdsourcing is emerging as a powerful tool in transportation, particularly for travel management and routing decisions. For example, Cyclopath (a geo-wiki where bicycle users in Minnesota share notes about bike lane and trail conditions on an editable map) is being used to crowdsource information about missing parts or trails on a lane for fellow bicyclists [33]. Other examples include smartphone-based applications, such as Google Maps, which provide dynamic routes to roadway users by crowdsourcing [34].
This paper develops a crowd-based perception score and applies the resulting model. The perception score is used to study potential mode-shift behavior among college students by soliciting comments and opinions on the mode used to arrive at the CSULB campus. Based on the outcome of this study, albeit for a small sample size, it is shown that the crowd-based perception score can potentially predict future ridership to a certain extent. Under the crowdsourcing intervention, the percentage of users of a transportation mode increased with the value of its crowd-based perception score.
The findings of this research can be further validated by increasing the participant pool in this crowdsourcing exercise. The results may have implications beyond the college setting for popularizing mode shifts to transit and other active transportation modes, if appropriate social media and information-sharing mechanisms through crowdsourcing are provided to transport users. However, there are limitations on the use of crowdsourcing as a technique for data collection [35]. When the problem that needs to be addressed is not clearly defined, crowdsourcing may not occur [36]. There may also be issues related to acquiring and integrating unsolicited ideas with crowdsourcing [37]. These limitations need to be kept in mind before conducting a full-fledged data collection exercise in the expectation of influencing mode choice.

CONSENT FOR PUBLICATION
Not applicable.

AVAILABILITY OF DATA AND MATERIALS
The authors confirm that the data supporting the findings of this study are available within the article.

FUNDING
None.

CONFLICT OF INTEREST
The authors declare no conflict of interest, financial or otherwise.

ACKNOWLEDGEMENTS
Declared none.

APPENDIX
Optimal crowd-based perception score $\Phi_m$

A.1. Evaluation for maximum of $\Phi_m$ with $0 \le \alpha \le 1$

(i) Given […], which is always true since the factors $f_{Y_m,p}$, $f_{Y_m,n}$, $\lambda$, and $N$ are, by design and assumption, all non-negative numbers.