Abstract
Numerous studies in recent months have proposed the use of linguistic instruments to support requirements analysis. There are two main reasons for this: (i) the progress made in natural language processing and (ii) the need to provide the developers of software systems with support in the early phases of requirements definition and conceptual modelling. This paper presents the results of an online market research study intended (a) to assess the economic advantages of developing a CASE (computer-aided software engineering) tool that integrates linguistic analysis techniques for documents written in natural language, and (b) to verify the existence of potential demand for such a tool. The research included a study of the language – ranging from completely natural to highly restricted – used in documents available for requirements analysis, an important factor given that on a technological level there is a trade-off between the language used and the performance of the linguistic instruments. To determine the potential demand for such a tool, some of the survey questions dealt with the adoption of development methodologies and consequently with models and support tools; other questions referred to activities deemed critical by the companies involved. Through statistical correspondence analysis of the responses, we were able to outline two “profiles” of companies that correspond to two potential market niches, which are characterised by very different approaches to software development.
Notes
Multi-year project funded by the Department of Computer and Management Sciences of Trento University.
Some comparisons deriving from our research are described in [1].
For further study of issues related to online market research, the interested reader can refer to the literature (see for example, the publications found at ESOMAR – European Society for Opinion and Marketing Research – http://www.esomar.org).
A bibliography is available at http://nl-oops.cs.unitn.it.
The first proposals to use linguistic criteria for the extraction of entities and relations, and then objects and associations, from narrative descriptions of requirements date from the 1980s [7].
For example, to recognise if Washington is the name of a person, of an airport, or of a city in a given document requires a semantic approach. Limitations on space do not permit a deeper discussion of this issue here; see for example [12].
For a recent study on why it is impossible for users to know their requirements beforehand, see [16].
On this point, see, for example, the tasks required by the MUC (Message Understanding Conference) competitions organised by DARPA (Defense Advanced Research Projects Agency) [17].
The official documents of the UML’s specifications can be found on the OMG (Object Management Group) website: http://www.omg.org.
Natural Language – Object-Oriented Production System, http://nl-oops.cs.unitn.it.
The questionnaire is available along with the data gathered and other related research material at http://on-line.cs.unitn.it.
For example, a questionnaire like the one used for the survey described in [23] would have to be radically altered to be used online.
In light of the observations in [24], this may not be so surprising.
The choice of tools for question 14 was made on the basis of sales data for a period prior to the study.
Because the survey concluded at the end of the Arena opera season, the tickets were replaced with CDs of opera music by Verdi.
One of the aims of the survey, in fact, was to investigate the conditions under which newsgroups can be used to carry out online surveys.
Limited number of questionnaires obtained (44) and accusations of spamming.
This is a rather high percentage, bearing in mind that they were collected from the homepages of official company websites. Another survey carried out in the same period on winter tourism, where the addresses were provided by a specialized magazine, found a very similar percentage of wrong addresses (8.9%), but the amount can be much higher. For example, in a survey of Internet users carried out in 1996, 35% of a total 1221 addresses were found to be wrong [26].
This was the minimum value for the traditional-type surveys, which achieved a maximum response rate of 20%. In the survey described by Glass and Howard [25], the percentage rose to 17% after the questionnaire mailings were supplemented by telephone contacts with fax follow-up.
For a survey on virtual supermarkets, a message was sent to 6 newsgroups obtaining 100 completed questionnaires.
All the percentages were calculated on the total number of respondents who answered the relevant questions, with non-replies omitted.
Further investigation of this aspect would require knowledge of the number and size of the companies’ customers. This, however, is beyond the scope of our survey.
For an introduction to the evaluation of potential demand, see for example, [27].
In the past, the need to support different graphic notations was a drawback to the market for CASE, in that it required producers to choose which notation to support with their own tools, or to absorb the higher cost of developing different versions.
A CASE based on linguistic techniques for object-oriented analysis does not necessarily require the realisation of an entire support environment, but rather can be seen as a module that can be integrated with an existing product.
A study of ‘robustness’ is also of utmost importance for establishing the degree of analyst intervention required in developing requirements models, and should be conducted using a prototype of the tool. See also point (b) of the introduction and conclusion.
None of the tools indicated by those choosing the option ‘Other’ was selected more than twice.
International Data Corporation (IDC) data.
These figures seem to contradict the results of the survey by Glass and Howard [25], where CASE technologies are described as being in decline. However, it should be pointed out that where back-end or ‘lower’ CASE technologies are concerned, many of the functions offered by these tools are by now part of the development environment. Moreover, other expressions are often used instead of ‘CASE’: for example, the IDC surveys use OOAMDC (object-oriented analysis, modelling, design, and construction) tools. On the other hand, in 1998 the market for OOAMDC grew by more than 10% (24% in Europe). See also the results in [29].
It should be pointed out, however, that the data of our survey are expressed in terms of units of output by the companies surveyed, while the sales figures are calculated on invoices and consequently depend on the prices charged by vendors.
Note also that around one-third of the final observations concerned the role and importance of requirements. Taking into account the different goals of the surveys described in [30, 31], we can compare these results with those obtained for a question therein on the perceived relative importance of software problems in Europe (most of the software problems are in the area of requirements specification and managing customer requirements, followed by documentation and testing) and on the perceived scope of a generic process model (defining system requirements, 78%).
In this regard we quote a remark made in one of the questionnaires: “I hate to be a cynic, but there are hardly any worthwhile tools. The overhead in learning to use them is too great for the payoff.”
The contingency table is available at http://on-line.cs.unitn.it.
Notable exceptions are the surveys conducted by the European Software Institute: http://www.esi.es.
These surveys were carried out with different objectives and using different methods and samples. The survey described in [25] used 78 questionnaires completed mainly by directors or managers of information systems development in companies operating outside the software field, while the Finnish one reports results for 12 Finnish companies, 8 of which worked exclusively in the software field.
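As a minimal sketch of the convention stated in the note on percentages (calculated on the respondents who answered each question, with non-replies omitted), the following Python fragment computes answer shares for a single question. The function name `answer_percentages`, the use of `None` to mark a non-reply, and the sample replies are all hypothetical and are not taken from the survey data.

```python
from collections import Counter
from typing import Dict, Optional, Sequence


def answer_percentages(responses: Sequence[Optional[str]]) -> Dict[str, float]:
    """Return the percentage of each answer among respondents who answered.

    A value of None marks a non-reply (an assumption of this sketch) and is
    left out of both the counts and the denominator.
    """
    answered = [r for r in responses if r is not None]
    if not answered:
        # Nobody answered this question, so no percentages can be reported.
        return {}
    counts = Counter(answered)
    return {option: 100.0 * n / len(answered) for option, n in counts.items()}


# Hypothetical replies to one question; None is a non-reply.
replies = ["yes", "no", None, "yes", None, "yes", "no", "yes"]
shares = answer_percentages(replies)
print(shares)
```

With this convention the shares for each question sum to 100%, but the denominator can differ from one question to the next, so percentages from different questions are not based on the same number of respondents.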
References
Franch M, Mich L, Osti L (2000) Online research as decision tool for marketing and management strategies. In: Gan R (ed) Proceedings of the Information Technology for Business Management – ITBM2000, 16th IFIP WCC, Beijing, China, 21–25 August 2000, pp 737–743
D’Elia M (2000) On-line market research: an application to the software domain (in Italian). Degree Thesis, University of Trento
Loucopoulos P, Karakostas V (1995) System requirements engineering. McGraw-Hill
Chiocchetti N, Mich L (2000) The market for object-oriented CASE tools (in Italian). Tech Report, Department of Computer and Management Sciences, University of Trento
Burg JFM (1997) Linguistic instrument in requirements engineering. IOS, Amsterdam
Ryan K (1992) The role of natural language in requirements engineering. IEEE, pp 240–242
Chen PP-S (1983) English sentence structure and entity-relationships diagrams. Inf Sci 29:127–149
Ambriola V, Gervasi V (1999) An environment for cooperative construction of natural-language requirements bases. In: Proceedings of the 8th ICRE. IEEE Computer Society Press, pp 124–130
Juristo N, Moreno AM, López M (2000) How to use linguistic instruments for OO analysis. IEEE Softw 17(3):80–89
Fuchs NE, Schwitter R (1996) Attempto Controlled English. In: CLAW ‘96, 1st international workshop on controlled language applications, Katholieke Universiteit, Leuven, Belgium
Delisle S, Barker K, Biskri I (1999) Object-oriented analysis: getting help from robust computational linguistic tools. In: Friedl G, Mayr HC (eds) Proceedings of the 4th International Conference on NLDB ‘99, Klagenfurt, Austria, 17–19 June 1999: Application of natural language to information systems (OCG Schriftenreihe 129), pp 167–172
Mich L, Garigliano R (2000) Ambiguity measures in requirements engineering. In: Proceedings of ICS 2000 16th IFIP WCC, Beijing, China, 21–25 August 2000, pp 39–48
Davis AM (1998) The harmony in rechoirments. IEEE Softw, March/April:6–8
Nitto E Di, Fuggetta A (1995) Change vs consolidation: a challenge for SW development organisations. Riv Inf AICA 25(4):267–279
Mylopoulos J (1998) Information modeling in the time of the revolution. Inf Syst 23(3–4):127–156
Rugg G, Hooper S (1999) Knowing the unknowable: the causes and nature of changing requirements. In: Eder J, Maiden N, Missikoff M (eds) Proceedings of the 1st International Workshop EMRPS ‘99, Venice, 25–27 September 1999, pp 183–192
AAA Message Understanding Conference (1991, 1992, 1993, 1995, 1998) Proceedings MUC-3, MUC-4, MUC-5, MUC-6, MUC-7. Morgan Kaufmann. http://www.itl.nist.gov/iaui/894.02/related_projects/muc/index.html
Fabbrini F, Fusani M, Gervasi V, Gnesi S, Ruggieri S (1998) Achieving quality in natural language requirements. In: Proceedings of International SW Quality Week, San Francisco, CA, May 1998
Laitenberger O, Atkinson C, Schlich M, El Emam K (2000) An experimental comparison of reading techniques for defect detection in UML design documents. J Syst Softw 53:183–204
Canzano G (1999) Natural language processing in market research: automatic analysis of replies to open-ended questions (in Italian). Degree Thesis, University of Trento
Mich L (1996) NL-OOPS: from natural language to OO requirements using the natural language processing system LOLITA. J Nat Language Eng 2(2):161–187
Mich L, Garigliano R (1999) The NL-OOPS project: OO modelling using the NLPS LOLITA. In: Friedl G, Mayr HC (eds) Proceedings of the 4th International Conference on NLDB ‘99, Klagenfurt, Austria, 17–19 June 1999: Application of natural language to information systems (OCG Schriftenreihe 129), pp 215–218
Nikula U, Sajaniemi J, Kaelviaeinen H (2000) A state-of-the-practice survey on requirements engineering in small- and medium-sized enterprises. Research Report 1, Lappeenranta University of Technology
Zvegintzov N (1998) Frequently begged questions and how to answer them. IEEE Softw 15(2):93–96
Glass R, Howard A (1998) Software development state-of-the-practice. Managing Syst Dev June:7–8
Comley P (1996) The use of the Internet as a data collection method. SGA Market Research, 1996
Wheelwright SC, Makridakis S (1985) Forecasting methods. Wiley, New York
Greenacre MJ (1984) Theory and application of correspondence analysis. Academic Press, New York
Dutta S, Lee M, Van Wassenhove L (1999) Software engineering in Europe: a study of best practices. IEEE Softw 16(3):82–90
ESI (1996) ESPITI, European user survey analysis. European Software Institute, Spain, Nov
ESI (1998) System engineering in Europe. Survey: summary of results. European Software Institute, Spain, Aug
van Genuchten M (1991) Why is software late? An empirical study of reasons for delay in software development. IEEE Trans SWE 17(6):582–590
Pearson K (1901) On lines and planes of closest fit to systems of points in space. Philos Mag 6(2):559–572
Melchisedech R (1998) Investigation of requirements documents written in natural language. Requirements Eng 3:91–97
ESI (1997) Software best practice questionnaire, analysis of results. European Software Institute, Spain, Dec
Additional information
An erratum to this article can be found at http://dx.doi.org/10.1007/s00766-004-0195-3
Appendices
Appendix A: Questionnaire for a new CASE tool
Appendix B: Online material
- Questionnaire (HTML form)
- Contacted newsgroups list
- E-mail messages
- Correspondence analysis
Cite this article
Mich, L., Franch, M. & Novi Inverardi, P. Market research for requirements analysis using linguistic tools. Requirements Eng 9, 40–56 (2004). https://doi.org/10.1007/s00766-003-0179-8
DOI: https://doi.org/10.1007/s00766-003-0179-8