Acta Orthopaedica Volume 92, Issue 4, August 2021 by Acta

4/21 ACTA ORTHOPAEDICA

Medical

IMPROVE THE CHANCES

REDUCE RISK FOR INFECTION Reduction of infection risk* using dual antibiotic-loaded bone cement in high risk patients

in aseptic revision TKA * as reported in study results

10453

www.heraeus-medical.com

in fractured neck of femur

Vol. 92, No. 4, 2021 (pp. 371–500)

34 % 69 % 57 %

in primary hip & knee arthroplasty

Volume 92, Number 4, August 2021

COVER.indd 1

02-07-2021 16:46:37

COVER.indd 2

02-07-2021 16:46:44

Acta Orthopaedica is owned by the Nordic Orthopaedic Federation and is the official publication of the Nordic Orthopaedic Federation

EDITORIAL OFFICE

Acta Orthopaedica Department of Orthopedics Lund University Hospital SE–221 85 Lund, Sweden E-mail: acta.ort@med.lu.se Homepage: http://www.actaorthop.org

THE FOUNDATION BOARD OF THE NORDIC ORTHOPAEDIC FEDERATION AND ACTA ORTHOPAEDICA

EDITOR

Anders Rydholm Lund, Sweden DEPUTY EDITOR

Peter A Frandsen Odense, Denmark CO-EDITORS

Li Felländer-Tsai Stockholm, Sweden Nils Hailer Uppsala, Sweden Ivan Hvid Oslo, Norway Urban Rydholm Lund, Sweden Bart A Swierstra Wageningen, The Netherlands Eivind Witsø Trondheim, Norway Rolf Önnerfält Lund, Sweden

Peter Frandsen Denmark Ragnar Jonsson Iceland Heikki Kröger Finland Anders Rydholm Sweden Kees Verheyen the Netherlands

WEB EDITOR

Magnus Tägil Lund, Sweden STATISTICAL EDITOR

Jonas Ranstam Lund, Sweden Philippe Wagner Västerås, Lund PRODUCTION MANAGER

Kaj Knutson Lund, Sweden

Vol. 92, No. 4, 2021

SUBSCRIPTION INFORMATION Acta Orthopaedica [print 1745-3674, online 1745-3682] is a peerreviewed journal, published six times a year plus supplements by Taylor & Francis on behalf of Nordic Orthopaedic Federation.

Airfreight and mailing in the USA by agent named WN Shipping USA, 156-15, 146th Avenue, 2nd Floor, Jamaica, NY 11434, USA. Periodicals postage paid at Jamaica NY 11431.

Annual Institutional Subscription, Volume 91, 2021

US Postmaster: Send address changes to Acta Orthopaedica, WN Shipping USA, 156-15, 146th Avenue, 2nd Floor, Jamaica, NY 11434, USA.

$1,291

£798

€1,035

The subscription fee purchases an online subscription. The price includes access to current content and back issues to January 1997 (if available). Printed copies of the journal are provided on request as a free supplementary service accompanying an online subscription. Supplements to the journal are also included in the subscription price. For more information, visit the journal’s website: http://www.tandfonline.com/IORT Manuscripts should be uploaded at http://www.manuscriptmanager.com/ao/ for further handling at: Acta Orthopaedica Editorial Office, Department of Orthopaedics, Lund University Hospital, SE-221 85 Lund, Sweden Correspondance concerning copyright and permissions should be sent to: Maria Montzka, Portfolio Manager – Medicine P.O. Box 3255, SE-103 65 Stockholm, Sweden, Tel: +46 (0)760 14 24 68. Fax: +46 (0)8 440 80 50. E-mail: maria.montzka@informa.com Ordering information: Please contact your local Customer Service Department to take out a subscription to the Journal: USA, Canada: Taylor & Francis, Inc., 530 Walnut Street, Suite 850, Philadelphia, PA 19106, USA. Tel: +1 800 354 1420; Fax: +1 215 207 0050. UK/ Europe/Rest of World: T&F Customer Services, Informa UK Ltd, Sheepen Place, Colchester, Essex, CO3 3LP, United Kingdom. Tel: +44 (0) 20 7017 5544; Fax: +44 (0) 20 7017 5198; Email: subscriptions@tandf.co.uk Dollar rates apply to all subscribers outside of Europe. Euro rates apply to all subscribers in Europe except the UK and Republic of Ireland. If you are unsure which applies, contact Customer Services. All subscriptions are payable in advance and all rates include postage. Journals are sent by air to the USA, Canada, Mexico, India, Japan and Australasia. Subscriptions are entered on an annual basis, i.e., January to December. Payment may be made by sterling check, US dollar check, euro check, international money order, National Giro, or credit card (Amex, Visa and Mastercard). Back issues: Taylor & Francis retains a two-year back issue stock of journals. Older volumes are held by our official stockists to whom all orders and enquiries should be addressed: Periodicals Service Company, 351 Fairview Ave., Suite 300, Hudson, New York 12534, USA. Tel: +1 518 537 4700; fax: +1 518 537 5899; e-mail: psc@periodicals.com.

Subscription records are maintained at Taylor & Francis Group, 4 Park Square, Milton Park, Abingdon, OX14 4RN, United Kingdom. Copyright © 2021 The Author(s). Published by Taylor & Francis on behalf of the Nordic Orthopedic Federation. This is an Open Access article distributed under the terms of the Creative Commons Attribution-Non-Commercial License (https://creativecommons.org/licenses/by-nc/3.0 . Informa UK Limited, trading as Taylor & Francis Group makes every effort to ensure the accuracy of all the information (the “Content”) contained in its publications. However, Informa UK Limited, trading as Taylor & Francis Group, our agents, and our licensors make no representations or warranties whatsoever as to the accuracy, completeness or suitability for any purpose of the Content. Any opinions and views expressed in this publication are the opinions and views of the authors, and are not the views of or endorsed by Informa UK Limited, trading as Taylor & Francis Group. The accuracy of the Content should not be relied upon and should be independently verified with primary sources of information. Informa UK Limited, trading as Taylor & Francis Group shall not be liable for any losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use of the Content. Terms & Conditions of access and use can be found at http://www.tandfonline. com/page/terms-and-conditions Indexed/abstracted in: Allied and Complementary Medicine Library (Amed); ASCA (Automatic Subject Citation Alert); Biological Abstracts; Chemical Abstracts; Cumulative Index to Nursing and Allied Health Literature(CINAHL); Current Advances in Ecological and Environmental Sciences; Current Contents/Clinical Medicine; Current Contents/Life Sciences; Developmental Medicine and Child Neurology; Energy Research Abstracts; EMBASE/ Excerpta Medica; Faxon Finder; Focus On: Sports Science & Medicine; Health Planning and Administration; Index Medicus/MEDLINE; Index to Dental Literature; Index Veterinarius; INIS Atomindex; Medical Documentation Service; Nuclear Science Abstracts (Ceased); Periodicals Scanned and Abstracted. Life Sciences Collection; Research Alert; Science Citation Index; SciSearch; SportSearch; Uncover Veterinary Bulletin. Printed in England by Henry Ling

Acta Orthopaedica

ISSN 1745-3674

Vol. 92, No. 4, August 2021 Editorial “There was no difference (p = 0.079)”

Annotation Survey of the specializing doctor training in orthopedics and traumatology across university hospitals in Finland COVID-19 Impact of the COVID-19 pandemic on emergency and elective hip surgeries in Norway Does the Covid-19 pandemic affect ankle fracture incidence? Moderate decrease in Sweden AI Availability and reporting quality of external validations of machine-learning prediction models with orthopedic surgical outcomes: a systematic review Deep neural networks with promising diagnostic accuracy for the classification of atypical femoral fractures Shoulder arthroplasty Thromboprophylaxis in primary shoulder arthroplasty does not seem to prevent death: a report from the Norwegian Arthroplasty Register 2005–2018 Hip, femur Prior hip arthroscopy does not affect 1-year patient-reported outcomes following total hip arthroplasty: a register-based matched case-control study of 675 patients Dislocation of hemiarthroplasty after hip fracture is common and the risk is increased with posterior approach: result from a national cohort of 25,678 individuals in the Swedish Hip Arthroplasty Register Precision of CT-based micromotion analysis is comparable to radiostereometry for early migration measurements in cemented acetabular cups Sex differences in incidence rate, and temporal changes in surgical management and adverse events after hip fracture surgery in Denmark 1997–2017: a register-based study of 153,058 hip fracture patients Proton-pump inhibitors are associated with increased risk of prosthetic joint infection in patients with total hip arthroplasty: a case-cohort study Cost utility analysis of intramedullary nailing and skeletal traction treatment for patients with femoral shaft fractures in Malawi Knee, ankle Outcomes after arthroscopic revision surgery for anterior cruciate ligament injuries A projection of primary knee replacement in Denmark from 2020 to 2050 Technical note: 1-stage total knee arthroplasty and proximal tibial non-union correction using 3-D planning and custom-made cutting guide Antiresorptive treatment and talar collapse after displaced fractures of the talar neck: a long-term follow-up of 19 patients Children, etc. Determining the development stage of the ossification centers around the elbow may aid in deciding whether to use ESIN or not in adolescents’ forearm shaft fractures Below-elbow cast sufficient for treatment of minimally displaced metaphyseal both-bone fractures of the distal forearm in children: long-term results of a randomized controlled multicenter trial

371

J Ranstam

373

S Ojala, H Kröger, J Leppilahti, J Paloneva, and J Sirola

376

K Magnusson, J Helgeland, Grøsland, and K Telle

381

E M Rydberg, M Möller, J Ekelund, O Wolf, and D Wennergren

385

O Q Groot, B J J Bindels, P T Ogink, N D Kapoor, P K Twining, A K Collins, M E R Bongers, A Lans, J H F Oosterhoff, A V Karhade, J-J Verlaan, and J H Schwab G Zdolsek, Y Chen, H-P Bögl, C Wang, M Woisetschläger, and J Schilcher

394 401

R M Hole, A M Fenstad, J-E Gjertsen, S A Lie, and O Furnes

408

I Lindman , J Nåtman, A Öhlin, K Svensson Malchau, L Karlsson, M Mohaddes, O Rolfson, and M Sansone

413

A Jobory, J Kärrholm, S Hansson, K Åkesson, and C Rogmark

419

C Brodén, O Sandberg, H Olivecrona, R Emery, and O Sköldenberg

424

L R Wahlsten, H Palm, G H Gislason, and S Brorson

431

M M Bruin, R L M Deijkers, R Bazuin, E P M Elzakker, and B G Pijls

436

L Chokotho, C A Donnelley, S Young, B C Lau, H-H Wu, N Mkandawire, J-E Gjertsen, Ghallan, K J Agarwal-Harding, and D Shearer

443

A V Yumashev, T V Baltina, and D V Babaskin

448 452

L Daugberg, T Jakobsen, P T Nielsen, M Rasmussen, and A El-Galaly A Kappel, P T Nielsen, and S Kold

455

A Meunier, L Palm, P Aspenberg, and J Schilcher

461

M Stöckell, T Pokka, N Lutz, and J-J Sinikumpu

468

L Musters, L W Diederix, K C Roth, P P Edomskis, G A Kraan, J H Allema, M Reijman, and J W Colaris

Knee flexion contracture impacts functional mobility in children with cerebral palsy with various degree of involvement: a cross-sectional register study of 2,838 individuals Limb lengthening, bone transport Pain, osteolysis, and periosteal reaction are associated with the STRYDE limb lengthening nail: a nationwide cross-sectional study Complications common in motorized intramedullary bone transport for non-infected segmental defects: a retrospective review of 15 patients

472

E H S Pantzar-Castilla, P Wretenberg, and J Riad

479

J D Rölfing , S Kold, T Nygaard, M Mikuzis, M Brix, C Faergemann, M Gottliebsen, M Davidsen, J Petruskevicius, and U K Olesen M Mikužis, O Rahbek, K Christensen, and S Kold

485

Giant cell tumor Pexidartinib improves physical functioning and stiffness in patients with tenosynovial giant cell tumor: results from the ENLIVEN randomized clinical trial

493

M Van De Sande, W D Tap, H L Gelhorn, X Ye, R M Speck, E Palmerini, S Stacchiotti, J Desai, A J Wagner, T Alcindor, K Ganjoo, J Martín-Broto, Q Wang, D Shuster, H Gelderblom, and J H Healey

Erratum Knee flexion contracture impacts functional mobility in children with cerebral palsy with various degree of involvement: a cross-sectional register study of 2,838 individuals (article on page 472 in this issue)

500

E H S Pantzar-Castilla, P Wretenberg, and J Riad

Information to authors (see http://www.actaorthop.org/)

Acta Orthopaedica 2021; 92 (4): 371–372

371

Editorial

“There was no difference (p = 0.079)”

2 kinds of medical scientific publications exist, evidencebased and authority-based. The 1st is based on evidence, systematic observations made in order to establish objective facts and reach reliable conclusions. The 2nd is instead based on the author’s personal experience, knowledge, and understanding. Medical research, historically authority-based but now mainly evidence-based, is typically performed with samples, limited groups of humans or animals, or specimens thereof. The purpose, however, is ultimately to generalize the observations beyond what has been observed, to humans or animals in general. Understanding when an observation can and cannot be claimed as evidence is thus crucial for a successful researcher. One underlying problem is sampling variation, i.e., the characteristics of multiple samples from a population of biologically diverse individuals are known to be heterogeneous, and the heterogeneity means uncertainty when just one single sample is studied. The solution to this problem is to quantify the uncertainty. Statistical inference Statistical inference, especially concepts such as p-values and confidence intervals—both uncertainty measures—therefore plays an important role when presenting research findings. Unfortunately, methodological misconceptions are ubiquitous in medical research. A few examples will be discussed here. Most papers contain a statistics section that includes descriptions like “variables were compared using Student’s t-test“. Such statements are formally incorrect, because statistical tests are not performed to compare observed variables but to test hypotheses concerning the properties of an unobservable population that is represented by the observed sample. The difference may be subtle, bit it is important. The p-value, an uncertainty measure, is the calculated probability of drawing a sample at least as extreme as the observed, given that a specific null hypothesis is true. The confidence interval is another uncertainty measure, which describes the inferential uncertainty of a specific estimate as a range of plausible values. P-value and clinical relevance A tested hypothesis may be clinically relevant, but the p-value itself says nothing about clinical relevance. Nevertheless, many authors believe that p-values represent scientific

importance. This is a common and serious mistake. A finding with p < 0.0001 may well be completely irrelevant. The relevance of a finding must simply be shown by means other than p-values. Furthermore, the clinical importance of a finding can depend on the effect of a studied factor. For example, the minimal clinically important difference (MCID) of VAS pain is usually defined as at least 10 VAS units, and if the effect of a treatment reduces pain by less than that, the treatment effect should be considered clinically irrelevant, even if p < 0.0001. To show that the estimated treatment effect is clinically important, a confidence interval can be used. A clinically relevant treatment effect would be indicated by a confidence interval excluding all effects lower than the MCID. There was no difference (p = 0.079) Numerous published papers report, based on statistical nonsignificance, that studied factors show “no effect”, that compared groups “do not differ”, and that the outcomes of investigated treatments “are not different”. However, statistical nonsignificance does not indicate equivalence but uncertainty, and uncertainty is not evidence. Moreover, a statement such as “there was no difference (p = 0.079)” is a contradictio in adjecto. If there actually were no difference (between the sample’s mean values) a test (of no difference in the population’s mean values) would have produced a p-value of 1.0. The presented p-value therefore shows that there actually was a difference in the sample. The probability that this observed difference is false positive, only existing in the sample, is 7.9%, marginally less unlikely than the 5% traditionally required for statistical significance. The p-value does not say anything about the risk of a false-negative conclusion, i.e., erroneously claiming that no difference exists. This risk may be considerably higher. In addition, a p-value says nothing about a study’s ability to detect clinically relevant differences. Referring to the previous example, an observed clinically relevant reduction in pain VAS of 20 units could well have been accompanied by p > 0.05. This would, with a 5% significance level, not be enough to claim that a clinically relevant treatment effect exists, but it would be a mistake to claim that the treatment had “no effect” on pain. The finding is simply uncertain, and this should be adequately reported.

© 2021 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group, on behalf of the Nordic Orthopedic Federation. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. DOI 10.1080/17453674.2021.1903727

372

A more informative presentation of the result could include the confidence interval of the estimated reduction in pain VAS. This interval would have shown the plausible values of the estimated reduction in pain VAS, say that it was –10 to 50. The result could be interpreted in the following way: In spite of not being able to provide reliable evidence of a beneficial effect, the investigation indicates that a potential effect is unlikely to be worse than a pain increase of 10 units or better than a pain reduction of 50 units. The Table 1 fallacy As a further description of common p-value misunderstandings, a Table 1 fallacy can be considered. The table usually describes baseline values after randomization in randomized trials and characteristics at start of follow-up in observational studies. Many authors include p-values in these tables. Why? 2 arguments are often given: (1) to enable an evaluation of the success of randomization in randomized trials, and (2) to show what variables need to be adjusted for in observational studies.

Acta Orthopaedica 2021; 92 (4): 371–372

P-values are, however, irrelevant for both purposes. The purpose of randomization is not to generate similar groups but to prevent systematic errors, and confounding adjustment is about validity (bias), not precision (p-values). Summary In summary, the current practice of (i) presenting research findings as “significant” without specifying whether this refers to practical importance (statistical significance) or to inferential uncertainty (clinical significance), (ii) presenting p-values as descriptive measures of practical importance, and (iii) claiming that statistical non-significance provides evidence of equivalence should be condemned. It demonstrates ignorance and an unsound inclination to replace scientific reasoning with p-values. In spite of all presented p-values, the actual content is not better than a subjective opinion. Good research provides objective evidence. Jonas Ranstam Statistical Editor email: jonas.ranstam@med.lu.se

Acta Orthopaedica 2021; 92 (4): 373–375

373

Annotation

Survey of the specializing doctor training in orthopedics and traumatology across university hospitals in Finland Sofianna OJALA 1, Heikki KRÖGER 1, Juhana LEPPILAHTI 2, Juha PALONEVA 3, and Joonas SIROLA 1 1 Kuopio

Musculoskeletal Research Unit (KMRU), University of Eastern Finland, and Department of Orthopedics, Traumatology and Hand Surgery, Kuopio University Hospital, Kuopio; 2 Department of Surgery, Division of Orthopedic and Trauma Surgery, Oulu University Hospital, Oulu, Finland/Medical Research Center, University of Oulu; 3 Central Finland Hospital, Jyväskylä, Finland, and University of Eastern Finland Correspondence: sofio@uef.fi Submitted 2020-12-03. Accepted 2021-03-04.

This annotation describes the results from national audit of the orthopedics and traumatology specialization program and specializing physicians’ skills across all 5 university hospitals in Finland (Helsinki University Hospital, HUH; Kuopio University Hospital, KUH; Tampere University Hospital, TAYS; Turku University Hospital, TYKS; and Oulu University Hospital, OYS). Competency-based training in surgical specialties is gathering more interest worldwide (Nousiainen et al. 2018, Gustafsson et al. 2019, LaPorte et al. 2019). In Finland, at the end of 2018, a reform of specializing physician training and also of the whole specialist training in surgery was launched, aiming at taking steps towards competency-based education (Paananen 2017, Seppänen 2018). The previous specialization curriculum was time dependent, taking 6 years of surgical training at minimum. It included 9 months in primary healthcare service, a minimum of 2 years and 3 months of general surgical training at a central hospital, after which more focused specialty training (such as orthopedics and traumatology) took place in a university hospital (3 years or more). The competency of a consultant orthopedic surgeon was then granted after finalizing the national specialization exam, consisting of 5 freely formulated questions concerning orthopedics and traumatology. As a result of the reform in Finland, the narrower specialty of surgery must currently already be decided at the application phase. The actual specialization includes a 12-month surgical orientation period in various areas of surgery, followed by a 6-month trial period in a narrower specialty, such as orthopedics and traumatology. After the trial period, there is 9–12 months of competency-based general education in a narrower specialty and then a 3-year differentiating phase. At least 1 year of train-

ing must be completed at a university hospital and at least 1 year at a central hospital. Therefore, differences may be found in Finnish specialist training in comparison with other countries (reviewed earlier). For an example, in the United Kingdom (UK), Trauma & Orthopedic surgery training initially includes a 2-year Foundation Training in different specialties of medicine, and after that doctors apply for a Core Surgical Training (CST) program for the next 2 years. CST includes 4- to 6-month periods in different areas of surgery. After CST, junior surgeons apply for a Specialist Surgical Training program, which typically lasts 6 years and ends with a specialty exit exam. After passing the exam a Certificate of Completion of Training is received (BOTA Collaborations and Rashid 2018). Compared with other European countries in addition to the UK, France and a few other countries do not have any mandatory course training, in comparison with Finland where 80 hours is required. On the other hand, in Croatia and Denmark the requirement is over 300 hours of course training. The highest minimum numbers of required surgical procedures are in the UK and Ireland—1800 procedures—whereas in Finland there are no specified requirements. In Finland there is a final written exam, but for example in Sweden there is no exam at all (Madanat et al. 2017). The present annotation provides extensive information on the different areas of specialization training in orthopedics and traumatology. Electronic survey The electronic audit questionnaire (Supplement 1) was compiled for specializing physicians (registrars) in orthopedics and traumatology using the SurveyMonkey tool. The questionnaire was sent by e-mail link to all specializing physicians (n = 61) at the time of the audit, i.e., April to June 2019. All

374

of these specializing physicians had completed the common trunk of their surgical training at the time of the audit. They are also part of the old system of specialist training before the reform at the end of 2018, when specialist training was time dependent, taking 6 years minimum, and general surgical training in various fields of surgery was 15 months in duration. Since the reform at the end of 2018, specialist training includes a 12-month orientation period in various fields of surgery and after that is focused only on the narrower specialty, for example orthopedics and traumatology, and is more competency-based than time dependent. The questionnaire included around 100 questions regarding surgical skills and education, clinical and scientific work, and other aspects of specializing physician training. The data was pseudonymized and the respondents gave permission to use the answers for research purposes. The audit included 2 questions on the amount of and competency in orthopedic and traumatological procedures performed. These numbers were a subjective estimate made by the specializing physicians themselves. 9 respondents gave indefinite, non-numerical answers and were eliminated. 14 respondents gave answers such as “100–200” or “100+,” in which case we considered the mean of the range as the definite answer or the lowest reported number. Educational views (Supplement 2) 36 (mean age 35 years, 23 male) of 61 submitted surveys were answered. 3 respondents answered only the first question and were eliminated from the analyses. 22 respondents considered job description to be the most important factor when choosing a future job. Interestingly, all respondents intend to work as an orthopedist in a public hospital or facility in the future after the specialization program rather than the private sector. 26 respondents consider that university hospitals have a good or very good opportunities for accessing leadership training. However, 10 respondents consider the opportunities to be poor or very poor. According to the respondents, leadership training is offered for an average of 0–30 credits and is free of charge. Almost all (32) respondents have calendar time set aside for meeting-type training (approximately 3 hours per week). However, no working time is set aside for preparation of meeting presentations. Surgical skills training (Supplement 3) When considering the traumatological procedures done by specializing physicians, all respondents have operated on a hip fracture with a trochanteric nail, operated on an ankle fracture, and 33 respondents have done a plate fixation of a wrist fracture independently in some way. In contrast, one-third have operated on a proximal humerus fracture and one-fifth have operated a vertebral fracture independently. When considering the orthopedic procedures performed by specializing physicians independently, none of the respon-

Acta Orthopaedica 2021; 92 (4): 373–375

dents have operated on a knee cruciate ligament or collateral ligament with a graft and only 1 has done medial patellofemoral ligament reconstruction independently. One-fourth have done shoulder decompression independently and 5 respondents have operated on a rotator cuff rupture. In contrast, 34 respondents have removed osteosynthesis material independently and 33 have done carpal canal release independently. Synthesis of the survey In this study, we audited the content of the specialist training program in Finland before the reform at the end of 2018. In this way, it will be possible to evaluate the success of renewed training in the future by implementing the survey again after 3–4 years. Most likely there will be changes in the duration of the specialization. Also, the number of independently performed surgical procedures may increase as the narrower specialty of surgery is already decided in the application phase and because the training is more competency oriented. According to the present audit, all of the respondents intend to work as a specialist at a public hospital or facility in the future and none of the respondents are considering working in the private sector. In many countries, it is common to enter a fellowship after specialization in orthopedics and traumatology. This is not the case in Finland, and the interest in working in the public sector might be due to fact that the respondents want to gather more experience after graduation before working in the private sector. In Finland, specialization in orthopedics and traumatology does not officially include a working period in the private sector. Accordingly, this may influence reluctance to consider a private hospital as a future employer. Specializing physicians gave a self-estimated number of how many independently performed procedures they have done already. A common logbook at the national level is paramount to obtain more exact information on the true number of procedures. At present, steps at the national level have been taken to introduce such a uniform logbook. Recent evidence favors a nonoperative treatment line for several orthopedic conditions. As an example, the number of independently performed surgeries on proximal humerus fractures was quite low, which may reflect treatment policies. Also, the number of arthroscopic procedures was low, reflecting recent evidence. The overall response rate was modest. Two-thirds of the specializing physicians in Finland responded to the survey, but this sample can be considered quite representative as all university hospitals were included. This audit did not include a section on pediatric orthopedics. Pediatric orthopedics is a subspecialty in Finland and is not provided in all university hospitals due to lack of resources. The purpose in this annotation was to audit basic training in orthopedics and traumatology provided at all university hospitals. In conclusion, according to our survey of the orthopedic specialization in Finland, the number of key orthopedic procedures was found to be quite high. The survey also provides

Acta Orthopaedica 2021; 92 (4): 373–375

widespread information on the general training conditions of specializing physicians in orthopedics and traumatology in Finland. In the future, auditing will be easy to extend to other areas of medical specialization too. The information can be used directly to develop the structure and content of specialist training. In the first instance, the procedures should be taught according to evidence-based medicine. According to the results of the questionnaire, the amount of arthroscopy training should be increased. Also, new audits in other countries can be compared to further develop specializing-doctor training. The effect of the renewal on specialization training remains to be seen after follow-up audits. Arthroscopy training may be improved by modern VR (virtual reality) based simulators. Also, other VR surgical training is evolving and may substantially change the training in widespread areas of orthopedics and traumatology. Funding and potential conflicts of interest This annotation did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors. The authors report no declarations of interest. Full results of the survey are available from the authors upon reasonable request. Supplementary data Supplements 1–3 are available in the online version of this article, http://dx.doi.org/10.1080/17453674.2021.1910772

375

The authors would like to thank Prof Ville Mattila, Prof Ilkka Kiviranta, Prof Hannu Aro, Prof Teppo Järvinen, Mikko Heinänen (MD), and the Finnish Orthopaedic Association for their participation in the study. Acta thanks Anne Garland and Rami Madanat for help with peer review of this study.

BOTA Collaborators, Rashid M S. An audit of clinical training exposure amongst junior doctors working in trauma & orthopaedic surgery in 101 hospitals in the United Kingdom. BMC Med Educ 2018; 18(1): 1. Gustafsson A, Viberg B, Paltved C, Palm H, Konge L, Nayahangan L J. Identifying technical procedures in orthopaedic surgery and traumatology that should be integrated in a simulation-based curriculum: a national general needs assessment in Denmark. J Bone Joint Surg Am 2019; 101(20): e108. doi: 10.2106/JBJS.18.01122. LaPorte D M, Tornetta P, Marsh J L. Challenges to orthopaedic resident education. J Am Acad Orthop Surg 2019; 27(12): 419-25. doi: 10.5435/ JAAOS-D-18-00084. Madanat R, Mäkinen T J, Ryan D, Huri G, Paschos N, Vide J; FORTE writing committee. The current state of orthopaedic residency in 18 European countries. Int Orthop 2017; 41(4): 681-7. doi: 10.1007/s00264-0173427-0. Nousiainen M T, Mironova P, Hynes M, Takahashi S G, Reznick R, Kraemer W, Alman B, Ferguson P, the CBC Planning Committee. Eight-year outcomes of a competency-based residency training program in orthopedic surgery. Med Teach 2018; 10: 1042-54. doi: 10.1080/0142159X.2017. Paananen P. Erikoistumisuudistuksen ihanuus ja kurjuus. Lääkärilehti; 2017 [in Finnish]. Seppänen A. Uusi hakumenettely erikoislääkärikoulutukseen tekee kriteerit läpinäkyviksi. Lääkärilehti; 2018 [in Finnish].

376

Acta Orthopaedica 2021; 92 (4): 376–380

Impact of the COVID-19 pandemic on emergency and elective hip surgeries in Norway Karin MAGNUSSON 1,2, Jon HELGELAND 1, Mari GRØSLAND 1, and Kjetil TELLE 1 1 Norwegian

Institute of Public Health, Cluster for Health Services Research, Oslo, Norway; 2 Lund University, Faculty of Medicine, Department of Clinical Sciences Lund, Orthopaedics, Clinical Epidemiology Unit, Lund, Sweden Correspondence: karin.magnusson@fhi.no Submitted 2020-10-02. Accepted 2021-02-04.

Background and purpose — Many countries implemented strict lockdown policies to control the COVID-19 pandemic during March 2020. The impacts of lockdown policies on joint surgeries are unknown. Therefore, we assessed the effects of COVID-19 pandemic lockdown restrictions on the number of emergency and elective hip joint surgeries, and explored whether these procedures are more/less affected by lockdown restrictions than other hospital care. Patients and methods — In 1,344,355 persons aged ≥ 35 years in the Norwegian emergency preparedness (BEREDT C19) register, we studied the daily number of persons having (1) emergency surgeries due to hip fractures, and (2) electively planned surgeries due to hip osteoarthritis before and after COVID-19 lockdown restrictions were implemented nationally on March 13, 2020, for different age and sex groups. Incidence rate ratios (IRR) reflect the after-lockdown number of surgeries divided by the beforelockdown number of surgeries. Results — After-lockdown elective hip surgeries comprised one-third the number of before-lockdown (IRR ~0.3), which is a greater drop than that seen in all-cause elective hospital care (IRR ~0.6). Men aged 35–69 had half the number of emergency hip fracture surgeries (IRR ~0.6), whereas women aged ≥ 70 had the same number of emergency hip fracture surgeries after lockdown (IRR ~1). Only women aged 35–69 and men aged ≥ 70 had emergency hip fracture surgery rates after lockdown comparable to what may be expected based on analyses of all-cause acute care (IRR ~0.80) Interpretation — It is important to note for future pandemics management that lockdown restrictions may impact more on scheduled joint surgery than other scheduled hospital care. Lockdown may also impact the number of emergency joint surgeries for men aged ≥ 35 but not those for women aged ≥ 70.

Because of COVID-19, Norway implemented early one of the strictest lockdown policies of all countries. The lockdown measures are believed to have limited the spread of the virus in this country dramatically, but on the other hand, may have had several unknown negative side effects on the planned and acute care for vulnerable groups. As an example, people with osteoporotic fractures and osteoarthritis are often elderly and fragile, with a high need for care to prevent long-term disability and death. The conditions are often managed by the most commonly performed surgical joint procedure worldwide: total hip arthroplasty (Learmonth et al. 2007). Whereas acute hip fracture surgeries should be performed within 24 hours according to national guidelines (NOF 2018), surgeries due to hip osteoarthritis are typically planned weeks or months in advance (Zhang et al. 2008). The impact of the COVID-19 lockdown restrictions on such typically acute and elective care for age-related conditions that require hospitalization is currently unknown, but can be hypothesized to be major, at least for elective care. Also, if there is an effect on acute care, knowledge of which population groups to a lesser extent need, or make use of, acute care is important in the future handling of pandemics. Such analyses may also provide knowledge for future natural experiments evaluating whether any care is in fact unnecessary in a longterm perspective (Moynihan et al. 2020). Thus, we assessed the effects of the COVID-19 pandemic lockdown restrictions on acute and elective inpatient care in Norway during spring 2020, using surgeries for hip fractures and surgeries for hip osteoarthritis as examples.

Methods We utilized data from the BEREDT C19 register, which is a newly developed emergency preparedness register aiming

Acta Orthopaedica 2021; 92 (4): 376–380

to provide rapid knowledge of the spread of the COVID-19 virus and how spread as well as measures to limit spread affect the population’s health, use of healthcare services, and health-related behaviors (Norwegian Institute of Public Health 2020). The register currently consists of electronic patient records from all hospitals in Norway (NPR), and data from the Norwegian Surveillance System for Communicable Diseases (MSIS) and the Norwegian Intensive Care and Pandemics Register (NIPaR), which are merged on the personal identification number and updated daily, with a range of other registry linkages currently ongoing. The register covers all data from hospitals (inpatient, outpatient, and day-care), with complete diagnostic and procedure codes from January 1, 2020 until the pandemic is over and has been evaluated. In the current study, our population included everyone in Norway registered with acute or elective inpatient care and we restricted our sample to the age groups to which diagnoses of hip osteoarthritis and fractures apply (age 35 or more). Outcomes Besides studying all registered inpatient care coded with emergency grades acute vs. elective (any cause), we studied the number of patients hospitalized with the outcomes (1) emergency surgeries due to hip fractures, and (2) electively planned surgeries due to hip osteoarthritis. Hip fracture surgeries were identified as having ICD-10 codes S72* (main diagnosis or other diagnosis) in combination with NCSP procedure code NFJ* and/or NFB*, and an emergency grade coded as acute. Hip osteoarthritis surgeries were identified as having ICD-10 codes M16* (main diagnosis or other diagnosis) and procedure code NCSP NFB* and an emergency grade coded as elective. Statistics We assessed the described outcomes prior to and after lockdown restrictions were implemented in Norway on March 13, 2020, i.e., in the period January to May 2020. We first studied all-cause acute and elective inpatient care using a Poisson regression model with the daily number of hospitalizations as outcome (i.e., acute and elective emergency grades in separate analyses) and time as explanatory variable (allowing for overdispersion in negative binomial models had negligible impacts on results). Thus, we categorized dates in 5 x 2-week periods before March 13, 2020, and 5 x 2-week periods after March 13, 2020, covering a total time period from January 3, 2020 to May 21, 2020. To observe trends in number of surgeries over time, we compared the incidence rate ratios (IRR) for all the 2-week periods with the base level, which was defined as the two first weeks of January, starting from January 3. We also compared IRR in the 10 weeks before and after lockdown restrictions were implemented nationally in Norway on March 13, 2020, using the period of 10 weeks before lockdown as base level (Jannuary 3–March 12). The IRR should be interpreted as the estimated number of surgeries for the given

377

Incidence rate ratios of daily hospitalizations 1.0

0.8

0.6

0.4

Acute care Elective care 3–16 17–30 31–13 14–27 28–12 13–26 27–9 10–23 24–7 8–21 Jan. Jan. J./F. Feb. F./M. March M./A. April A./M. May

Figure 1. Incidence rate ratios of daily hospitalizations (any cause) with acute and elective emergency grade in Norway, January 3, 2020–May 21, 2020, with January 3–January 16 as base level, with 95% confidence intervals. Red dots/line = acute care. Blue dots/line = elective care. The red and blue 2-weekly dots are graphed next to each other for improved readability. Vertical line represents the national implementation of lockdown strategies on March 13, 2020.

period divided by the estimated number of surgeries in the base level period. We then repeated these analyses for emergency hip fracture surgeries and elective hip osteoarthritis surgeries. Also, to explore whether any hip surgery patient groups may be more affected by lockdown restrictions in terms of their healthcare use than would be expected from our analyses of all-cause acute and elective hospitalizations, we stratified the analyses on age (35–69 years vs. 70 and above), for men and women separately. Analyses were adjusted for weekends and holidays. Finally, we predicted number of unperformed surgeries in the period after March 13, 2020 from the same Poisson regression model, conditional on weekdays. We used Stata version 16.1 (StataCorp, College Station, TX, USA) for all analyses. Ethics, funding, and potential conflicts of interest Institutional board review was conducted, and the Ethics Committee of South-East Norway confirmed (June 4, 2020, #153204) that external ethical board review was not required. The study was funded by the Norwegian Institute of Public Health. No external funding was received, and the authors declare no conflicts of interests.

Results BEREDT C19 comprised 1,344,355 individuals with at least one contact with specialist care from January 3 2020 to May 21 2020. Of those aged 35 or more having > 24-hour hospitalizations (inpatients), we observed 73,091 emergency and 187,714 elective hospitalizations (persons could be counted in both groups). Persons receiving elective inpatient care had half the rate of care after lockdown (n = 26,360 hospitalizations) compared with before lockdown (n = 46,731 hospitalizations) (IRR 0.56, 95% confidence interval [CI] 0.56–0.57 as compared with base levels before lockdown, IRR 1) (Figure 1).

378

Acta Orthopaedica 2021; 92 (4): 376–380

IRR (95%CI) — men 35–69 years

IRR (95%CI) — women 35–69 years

IRR (95%CI) — men ≥70 years

IRR (95%CI) — women ≥70 years

2.0 1.5 1.0

2.0 1.5

2.0 1.5 1.0

1.0

0.5

Emergency hip surgeries due to fracture Elective hip surgeries due to osteoarthritis 3–16 17–30 31–13 14–27 28–12 13–26 27–9 10–23 24–7 8–21 Jan. Jan. J./F. Feb. F./M. March M./A. April A./M. May

Figure 2. Incidence rate ratios (y-axis, IRR) of emergency hip surgeries due to fracture (red) and the IRR of elective hip surgeries due to osteoarthritis (blue) in Norway, January 3, 2020–May 21, 2020 with January 3–January 16 as base level with 95% confidence intervals (CI). Red dots/line = emergency hip fracture surgeries. Blue dots/line = elective hip osteoarthritis surgeries. The red and blue 2-weekly dots are graphed next to each other for improved readability. Vertical line represents the national implementation of lockdown strategies on March 13, 2020.

Table 1. Hip surgeries 10 weeks after compared with 10 weeks before (base level) the implementation of national lockdown on March 13, 2020. All events represent > 24-hour hospitalizations Factor All-cause elective care IRR (CI) after lockdown vs. before Elective hip surgery (osteoarthritis) No. of surgeries before lockdown No. of surgeries after lockdown IRR (CI) after lockdown vs. before Estimated no. (CI) of cancelled surgeries after lockdown vs. before All-cause acute care IRR (CI) after lockdown vs. before Emergency hip surgery (fractures) No. of surgeries before lockdown No. of surgeries after lockdown IRR (CI) after lockdown vs. before Estimated no. (CI) of avoided hip fractures after lockdown vs. before

Men Men Women Women 35–69 years ≥ 70 years 35–69 years ≥ 70 years 0.58 (0.58–0.58)

0.60 (0.60–0.60)

0.55 (0.54–0.55)

0.53 (0.52–0.53)

394 105 0.27 (0.26–0.27)

349 115 0.33 (0.32–0.34)

580 193 0.33 (0.33–0.34)

736 229 0.31 (0.31–0.32)

289 (284–294)

234 (229–239)

387 (380–393)

507 (499–514)

0.83 (0.83–0.83)

0.81 (0.81–0.81)

0.79 (0.79–0.80)

189 107 0.56 (0.55–0.58)

539 426 0.79 (0.79–0.80)

163 133 0.82 (0.79–0.84)

1,039 1,054 1.01 (1.00–1.02)

82 (78–86)

113 (106–120)

30 (25–34)

15 (4–25) a

a Estimated

increase. CI: 95% confidence interval. IRR: incidence rate ratio.

Also, acute care occurred at a 20% lower rate after lockdown (n = 83,838 hospitalizations) than before (n = 103,875 hospitalizations) (IRR 0.81, CI 0.80–0.81) (Figure 1). We observed 2,701 new hip surgeries due to osteoarthritis and 3,650 new hip surgeries due to fractures throughout the study period, i.e., in persons without prior prosthesis surgery in any of the hip joints. Before March 13, 2020, the daily number of new hip surgeries corresponded to that reported for the same time period in previous years (Helse Bergen 2014). The rate of elective hip surgeries due to osteoarthritis dropped substantially after lockdown restrictions were implemented on March 13, 2020, with similar observations across age and sex groups (Figure 2). For men aged 35–69, there was also a slight decrease in the rate of emergency hip surgeries due to fractures (Figure 2).

When compared with what may be expected, based on the average after-lockdown drop in all-cause elective and allcause acute hospitalizations, we observed large deviations for our musculoskeletal outcomes. For elective hip osteoarthritis surgeries, IRRs were half that observed for elective all-cause hospitalization, for all age and sex strata (IRR ~0.3 for elective hip osteoarthritis surgeries vs. IRR ~0.6 for all-cause elective care) (Table). This would imply that around 1,400 planned hip surgeries in Norway have not been performed due to the COVID-19 pandemic and would need to be treated elsewhere or scheduled for surgery on another date (estimated number of unperformed hip osteoarthritis surgeries after March 13, 2020 = 1,417, CI 1,392–1,440). For emergency hip fracture surgeries there were large variations by age and sex. Men aged 35–69 had a lower rate of

Acta Orthopaedica 2021; 92 (4): 376–380

emergency hip fracture surgery than of all-cause acute care (IRR ~0.6 vs. ~0.8), whereas women aged 70 or more had a higher such rate (IRR ~0.8 vs. ~1.0) (see Table). In contrast, women aged 35–69 and men aged ≥ 70 had hip fracture surgery rates comparable to what may be expected based on analyses of all-cause acute care (IRR ~0.80) (see Table). Altogether, lockdown restrictions may have given a reduction of ~200 acute events requiring immediate hip fracture surgery (estimated number of avoided hip fractures after March 13, 2020 = 210, CI 184–236).

Discussion In this study based on data from the BEREDT C19—the Norwegian emergency preparedness register—we report a sudden and steep decrease in the daily number of planned hip joint surgeries, beginning on the day after lockdown restrictions were implemented in Norway on March 13, 2020. This decrease was greater than the decrease in other (all-cause) elective inpatient care for all age and sex groups. Interestingly, we also report a consistent decrease in all-cause acute inpatient care that was found only partly in age- and sex-specific analyses of emergency hip joint fracture surgeries; lockdown restrictions may impact on the number of acute joint surgeries for middle-aged and elderly men (aged ≥ 35) as well as for middle-aged women (age 35–69), but not for elderly women (aged ≥ 70). The observed decrease of elective care including hip surgeries is not surprising, and sheds new light on a recent report on effects of lockdowns on elective care globally (COVIDSurg Collaborative, 2020). Here, we additionally show that the activity of elective surgeries was reduced more than other elective activities in inpatient care, and that it increased rather quickly again, as authorities gained control over the spread of the pandemic during April 2020. However, surgery rates were not back to normal (437 elective hip surgeries per 14-days periods during 2019 (Nasjonal kompetansetjeneste for leddproteser og hoftebrudd 2019)) by the end of May 2020 and around 1,400 elective hip osteoarthritis surgeries would have to be rescheduled or treated nonoperatively in primary care. Figure 2 shows that the activity recovered approximately equally for the different age and sex groups, although there may be minor variations. For emergency hip surgeries due to fractures, we observed a somewhat unexpected decrease in incidence that was more evident for men than for women. Hospitals were not instructed to limit access to acute care, so the observed decrease may be explained by the fact that people stayed at home/inside more, which reduced the risk of falls and subsequent hip fractures. If so, the non-decreasing incidence of hip fractures in women may be explained by female hip fractures more frequently being a result of intrinsic causes like bone mineral density (Emaus et al. 2009). Whereas all age and sex groups had fewer

379

hip fracture surgeries after lockdown, women aged ≥ 70 had a slightly increased hip fracture surgery rate, with an additional 4–25 surgeries occurring in the 10 weeks after lockdown compared with the 10 weeks before lockdown. Our findings may have implications for the handling of future new outbreaks of COVID-19. First, our data show that when elective healthcare and other parts of society are locked down by the authorities, the instructions are followed by the hospitals and the use of elective care decreases to a similar magnitude for men and women, young and elderly. Using hip osteoarthritis surgery as an example, we also show that elective joint surgery rates decrease more than other elective inpatient care after lockdown restrictions are implemented. Second, the lockdown restrictions likely also impacted on the need for/use of acute health are, and did so to a different extent for men and women when exemplified using emergency hip fracture surgeries. Thus, our study implies that when policymakers consider lockdown of certain elective hospital activities during future outbreaks, it might affect people with musculoskeletal pain, and in particular men, disproportionally severely. Considering that chronic pain may lead to permanent work disability, policymakers may want to be more careful in locking down healthcare services for this large group of persons in the future. We suggest the future effects of unperformed hip surgeries due to lockdown as a topic for future study. Also, our findings suggest that different population groups have different levels of anxiety in seeking healthcare when a pandemic is present, i.e., people may be afraid of seeking healthcare because of risk of infection. We suggest future studies to further explore the causes for the different reductions in acute hip surgeries in elderly and young men and women. Some important limitations should be mentioned. First, we could not study whether the effect of lockdown restrictions is causal. For example, it is possible that the decrease in emergency hip fracture surgeries in men aged 35–69 is partly due to seasonal variations. However, we note that the number of surgeries prior to March 13, 2020 was similar to that reported for previous years (Helse Bergen 2014). Also, our findings apply only to Norwegian conditions and countries with similar healthcare services, healthcare organization, and demography to Norway. Future studies should explore the effects of lockdown restrictions on healthcare use comparing different countries’ lockdown strategies. A second limitation may be that we could not distinguish between experiencing joint pain and/ or an acute event and seeking healthcare. Thus, as described above, there may be age and sex differences in care-seeking behavior that we could not account for here. Finally, there may be several potential competing risks in our sample. For example, persons hospitalized for cancer treatment may be unlikely to experience a hip fracture because they are indoors more. However, our goal was not to study disease etiology; rather, we give an overview of potential impacts individuals in need of hip joint surgery may experience as a result of lockdown restrictions.

380

In conclusion, we show that the lockdown restrictions implemented in Norway due to the spread of the COVID-19 pandemic reduced the use of elective inpatient care, but also acute inpatient care. In particular, we report that men and midlife age groups had a lower rate of emergency hip fracture surgeries after than before lockdown. We believe it is important to report these findings for improved knowledge, allowing for optimal management of future pandemics on a similar or larger scale.

The authors would like to thank the Norwegian Directorate of Health, in particular Director for Health Registries Olav Isak Sjøflot and his department, for excellent cooperation in establishing the emergency preparedness register. They would also like to thank Gutorm Høgåsen, Anja Lindman, and Ragnhild Tønnessen for their invaluable efforts in the work on the register. The interpretation and reporting of the data are the sole responsibility of the authors, and no endorsement by the register is intended or should be inferred. The authors would also like to thank everyone at the Norwegian Institute of Public Health who has been part of the outbreak investigation and response team. KM had access to all of the data in the study and takes full responsibility for the integrity of the data and the accuracy of the data analysis and statistical analyses performed, and also drafted the manuscript. JH, MG, and KT contributed with acquisition of data, conceptual design, analyses, and interpretation of results. All authors contributed to drafting the article or critically revising.

Acta Orthopaedica 2021; 92 (4): 376–380

Acta thanks Yan Li and Jan Duedal Rölfing for help with peer review of this study. COVIDSurg Collaborative. Elective surgery cancellations due to the COVID-19 pandemic: global predictive modelling to inform surgical recovery plans. Br J Surg 2020; 107(11): 1440-9. Emaus N, Omsland T K, Ahmed L A, Grimnes G, Sneve M, Berntsen G K. Bone mineral density at the hip in Norwegian women and men—prevalence of osteoporosis depends on chosen references: the Tromsø Study. Eur J Epidemiol 2009; 24(6): 321-8. Helse Bergen H F. Ortopedisk klinikk, Haukeland universitetssykehus. Nasjonalt register for leddproteser. Rapport; juni 2014. Moynihan R, Johansson M, Maybee A, Lang E, Légaré F. COVID-19: an opportunity to reduce unnecessary healthcare. BMJ 2020; 370: m2752. Learmonth I D, Young C, Rorabeck C. The operation of the century: total hip replacement. Lancet 2007; 370(9597): 1508-19. Nasjonal kompetansetjeneste for leddproteser og hoftebrudd. Annual report 2019, p. 33. http://nrlweb.ihelse.net/Rapporter/Rapport2020.pdf NOF—Norsk Ortopedisk Forening. Norske retningslinjer for tverrfaglig behandling av hoftebrudd; 2018. Norwegian Institute of Public Health. Beredskapsregisteret for COVID-19; 2020. Available at: https://www.fhi.no/sv/smittsomme-sykdommer/corona/ norsk-beredskapsregister-for-COVID-19/ Zhang W, Moskowitz R W, Nuki G, Abramson S, Altman R D, Arden N, Bierma-Zeinstra S, Brandt K D, Croft P, Doherty M, Dougados M, Hochberg M, Hunter D J, Kwoh K, Lohmander L S, Tugwell P. OARSI recommendations for the management of hip and knee osteoarthritis, Part II: OARSI evidence-based, expert consensus guidelines. Osteoarthritis Cartilage 2008; 16(2): 137-62.

Acta Orthopaedica 2021; 92 (4): 381–384

381

Does the Covid-19 pandemic affect ankle fracture incidence? Moderate decrease in Sweden Emilia Möller RYDBERG 1,2, Michael MÖLLER 1,2, Jan EKELUND 3, Olof WOLF 4,5, and David WENNERGREN 1,2 1 Institute

of Clinical Sciences, Sahlgrenska Academy, University of Gothenburg, Gothenburg; 2 Department of Orthopaedics, Sahlgrenska University Hospital, Gothenburg/Mölndal; 3 Centre of Registers Västra Götaland; 4 Section of Orthopaedics, Department of Surgical Sciences, Uppsala University, Uppsala; 5 Department of Orthopaedics, Uppsala University Hospital, Uppsala, Sweden Correspondence: emilia.rydberg@vgregion.se Submitted 2020-12-23. Accepted 2021-03-02.

Background and purpose — While many other countries implemented strict regulations and restrictions for their citizens during the 1st wave of the Covid-19 pandemic, Sweden maintained a more restrained approach. The Swedish Public Health Agency emphasized individual responsibility and pushed for behavioral changes. With strict lockdown a 77% decrease in ankle fracture incidence has been reported. We investigated whether there was a change in the incidence of ankle fractures seen at 7 selected hospitals during the Covid19 pandemic 2020. Patients and methods — Data on all ankle fractures treated at 7 selected departments during March 15 through June 15, 2020, and for the same period in the preceding 3 years (2017–2019), was retrieved from the Swedish Fracture Register. The number of fractures during the whole period and subsequent 30-day periods were compared between 2020 and 2017–2019, including subgroup analyses of age and sex. Results — The monthly rate of ankle fractures was reduced by 14% in 2020 (139 fractures) compared with 2017–2019 (161 fractures). Women had a 16% decrease and patients aged > 70 years had a 29% decrease. During the 1st 30-day period, a 26% decrease in fractures was seen. Interpretation — During the 1st wave of the Covid-19 pandemic, a moderate decline in the number of ankle fractures was seen. Women and patients aged > 70 years displayed the greatest reduction. The greatest reduction in incidence of fractures was seen during the 1st 30-day period. This indicates greater adherence to government recommendations regarding social distancing in these subgroups and during the 1st month of the pandemic. Changes in ankle fracture incidence may be a measure of lockdown extent.

While many other countries implemented strict regulations and restrictions for their citizens during the 1st wave of the Covid-19 pandemic, imposing quarantine and entry bans and closing restaurants, schools, and preschools to prevent the spread, Sweden maintained a more restrained approach. The Public Health Agency of Sweden (SPHA) emphasized individual responsibility and pushed for behavioral changes instead of imposing regulations, with the aim of reducing the pace of viral transmission (Public Health Agency of Sweden 2020). From mid-March, 2020, Swedes were advised to stay at home if they had symptoms, to keep their distance and take personal responsibility (Government Offices of Sweden 2020). The only ban imposed in Sweden during the 1st months of the pandemic during the spring of 2020 was a ban on public gatherings of 50 or more people on March 27. Ankle fractures are common in all age groups and both sexes and most often sustained after a simple same-level fall (Bergh et al. 2020, Rydberg et al. 2020). Following the recommendations imposed by the SPHA, people worked more from home, thereby eliminating travelling to work. Sports activities and competitions were cancelled on a widespread scale. A reduction in the number of ankle fractures could be expected. Haskel et al. (2020) reported a 77% reduction in ankle fractures in New York and Lubbe et al. (2020) reported a reduction in the number of orthopedic trauma and foot injuries in Las Vegas during a 45-day period in March–April. Kuitunen et al. (2020) described a substantial decrease in Finland in the number of visits to A&E due to orthopedic conditions during the 1st wave of the pandemic. The Swedish Fracture Register (SFR) is a national quality register that has been collecting information on fractures of all types for almost 10 years (Wennergren et al. 2015). The SFR

382

collects data on patient characteristics, injury mechanism, fracture type, and subsequent treatment methods for fractures of all types, treated surgically as well as non-surgically. As data are entered in the register by the treating physician at the time of the injury, they can be extracted from the register without delay. We investigated whether there was a change in the incidence of ankle fractures seen in some selected hospitals during the Covid-19 pandemic compared with the same time period during the previous 3 years. We also investigated whether a reduction in the incidence of ankle fractures was seen in subgroups defined by age and sex, or during certain 30-day periods.

Patients and methods The study is based on data from the Swedish Fracture Register (SFR). Data is entered in the SFR by the responsible physician, usually at the time of presentation at the accident and emergency department. The register has almost 100% coverage, i.e., almost all departments in Sweden treating fractures are affiliated with the SFR. The completeness of fracture registration in the SFR, i.e., the number of fractures treated in each department that is registered in the SFR, is evaluated annually and several studies have evaluated the validity of fracture classification in the SFR (Juto et al. 2016, Wennergren et al. 2017, Swedish Fracture Register 2020). This is an observational study based on data on ankle fractures, in patients aged 16 years and above, treated at a sample of orthopedic departments with a history of high completeness in their registrations in the SFR. The orthopedic departments at the hospitals in Varberg, Uddevalla/Trollhättan, Göteborg, Borås, Falun, Gävle, and Östersund have all had a completeness in their registrations of 70% or more in the past 3 years (2016–2018) and were therefore included in the study (Swedish Fracture Register 2018, 2019). The same departments also have a history of rapid fracture entry in the register (registering a substantial number of fractures within 30 days of the injury) (Swedish Fracture Register 2020). Data was extracted from the SFR on all ankle fractures treated at the 7 departments listed above during March 15 through June 15, 2020, and the same period in the preceding 3 years (2017–2019). Statistics In order to minimize the risk of overinterpreting differences in the numbers of fractures for individual years, the observed time period for 2020 was compared with the mean for the corresponding time periods in 2017–2019. Comparisons were made for the total number of fractures during the observed time period as a whole and for the 3 x 30–day periods (March 15–April 14, April 15– May 14, and May 15–June 15) respectively. Subgroup analyses included sex and age groups. Descriptive statistics are presented as means (SD), medians (range), and proportions.

Acta Orthopaedica 2021; 92 (4): 381–384

Table 1. Demographic data on ankle fractures 2017–2019 and 2020 Factor

2017–2019 2020

Age, mean (SD) median (range) Female sex (%)

54 (20) 56 (16–98) 60

52 (20) 53 (16–100) 59

Incidence rates were compared, assuming the population size was similar during the time period 2017–2020 and that the number of fractures has a Poisson distribution. 95% confidence intervals (CI) for differences in fracture incidence were obtained by approximating the Poisson distribution with the normal distribution. All statistics were calculated with IBM SPSS 25 (IBM Corp, Armonk, NY, USA) or SAS v 9.4 (SAS Institute, Cary, NC, USA). Ethics, funding, and potential conflicts of interest The study was approved by the Swedish Ethical Review Authority (August 12, 2020; reference number 2020-02783). No funding was obtained and no conflicts of interest were declared.

Results Number of fractures From March 15 to June 15, 2020, 417 patients with ankle fractures were registered. The mean age was 52 years (SD 20) and 58% were women. During the same time period in 2017–2019, 1,446 patients with ankle fractures presented at the same departments. The mean age was 54 years (SD 20) and 60% were women (Table 1). The monthly rate of ankle fractures was 139 for the observed months in 2020 and 161 in 2017–2019. This is a statistically significant decrease of 22 (CI –37 to –6) fractures/month, corresponding to a reduction of 14% for the observation period in 2020 compared with the same period in 2017–2019 (Table 2). Age and sex distribution There was a statistically significant reduction in the observed number of fractures in women of 16 (CI -28 to -4) fractures/ month, corresponding to a reduction of 16%. In the age group 70 years or older, the number of fractures was reduced by 12 (CI -19 to -4) fractures/month, constituting a 29% reduction (Table 2). Time periods When analyzed month by month, there were 45 (CI –72 to –19) fewer ankle fractures/month during the 1st 30-day period, corresponding to a reduction of 26%. For the following 2 x 30-day periods, no statistically significant changes were observed (Table 3).

Acta Orthopaedica 2021; 92 (4): 381–384

383

Table 2. Number of ankle fractures observed March 15 to June 15, 2017–2019 and 2020 Factor

Monthly rates

Observed number Estimated Change 2017– 2017– difference (%) from 2019 2020 2019 2020 (95% CI) 2017–2019

Total 1,446 417 161 139 –22 (–37 to -6) –14 Sex Male 574 173 64 58 –6 (–16 to 4) –10 Female 872 244 97 81 –16 (–28 to –4) –16 Age group < 30 251 68 28 23 –5 (–12 to 1) –19 30–49 302 108 34 36 2 (–5 to 10) 7 50–69 524 153 58 51 –7 (–17 to 2) –12 ≥ 70 369 88 41 29 –12 (–19 to –4) –29

Table 3. Observed number of ankle fractures 2017–2019 and 2020, analyzed month by month Monthly rates Observed number Estimated Change 2017– 2017– difference (%) from Period 2019 2019 2020 (95% CI) 2017–2019 March 15–April 14 April 15–May 15 May 16–June 15

523 431 492

174 144 164

129 144 144

–45 (–72 to –19) –26 0.3 (–27 to 28) 0.2 –20 (–48 to 8) –12

A further monthly subgroup analysis grouped by sex showed a decline of 34 (CI –54 to –13) fractures/month, representing a 31% reduction in the number of fractures among women during the 1st 30-day period studied in 2020. There was no reduction for the other periods or for men at any time point (Table 4). In the age group 30–49 years an increase of 13 (CI –1 to 27) fractures/month was seen for the second 30-day period between April 15 and May 15, corresponding to an increase of 43% (Table 4). The subgroup analysis also revealed 18 (CI –31 to –4) fewer fractures/month during the 1st 30-day period in the age group of 70 years or older, constituting a reduction of 36%. Distribution between AO fracture groups and treatment type During the observed months 2017–2019 compared with the same time period in 2020, the distribution of fractures by AO fracture group was similar. Likewise, regarding treatment type, similar distribution was observed between surgical and non-surgical treatment (data not shown).

Discussion Our most important finding is a reduction in the number of

Table 4. Observed number of ankle fractures in 2017–2019 and 2020, analyzed month by month and sub-grouped for sex and into age groups Monthly rates Observed number Estimated Change 2017– 2017– difference (%) from Period 2019 2019 2020 (95% CI) 2017–2019 March 15–April 14 Male Female April 15–May 15 Male Female May 16–June 15 Male Female March 15–April 14 Age < 30 Age 30–49 Age 50–69 Age ≥ 70 April 15–May 15 Age < 30 Age 30–49 Age 50–69 Age ≥ 70 May 16–June 15 Age < 30 Age 30–49 Age 50–69 Age ≥ 70

200 323

67 108

55 74

–12 (–29 to 6) –18 –34 (–54 to –13) –31

160 271

53 90

56 88

3 (–14 to 20) –2 (–24 to 19)

5 –3

214 278

71 93

62 82

–9 (–28 to 9) –11 (–32 to 10)

–13 –12

82 105 190 146

27 35 63 49

22 28 48 31

–5 (–16 to 6) –7 (–19 to 5) –15 (–32 to 1) –18 (–31 to –4)

–19 –20 –24 –36

74 90 153 114

25 30 51 38

19 43 52 30

–6 (–16 to 5) 13 (–1 to 27) 1 (–15 to 17) –8 (–21 to 5)

–23 43 2 –21

95 107 181 109

32 36 60 36

27 37 53 27

–5 (–17 to 7) 1 (–12 to 15) –7 (–24 to 9) –9 (–22 to 3)

–15 4 –12 –26

ankle fractures during the 1st wave of the Covid-19 pandemic in 7 investigated departments in Sweden. The reduction occurred in women, and was most pronounced in patients aged 70 years or older, and for the 1st 30-day period of the pandemic. Women and people over the age of 70 apparently had the highest adherence. Our findings are in line, but with a lower incidence reduction, with the findings in the recent study by Haskel et al. (2020), who reported a 77% reduction in the number of ankle fractures during the 1st month of the pandemic in a level 1 trauma center in New York City. The New York lockdown was strict: all non-essential businesses closed, discouragement of public transportation, individuals to stay at least 6 feet from one another, and social gatherings of any size prohibited, i.e. regulations much stronger than those applied in Sweden. Our study has a longer observation time and reports a reduction in the number of ankle fractures for the entire time period, as well as for the 1st month. This could be interpreted as meaning that the adherence to recommendations on social distancing was higher during the 1st month of the pandemic than during the following months. Furthermore, we found a reduction in ankle fractures in patients aged > 70 years for the entire study period, as well as the 1st 30-day period. During the 1st wave of the pandemic, the Swedish authorities issued specific recommendations for people in this age group. The observed

384

decrease in the number of fractures could be interpreted as strict adherence to these recommendations by the elderly, especially during the period from mid-March to mid-April. This is supported by the findings by Ponkilainen et al. (2020) reporting a decrease in the number of hip fracture patients, interpreted as a good adherence from senior citizens to recommendations on social distancing and lessening of activities in Finland. In Hong Kong, there was a 20% decrease in surgically treated lower limb fractures during a somewhat earlier Covid19 wave from January 25 to March 25 compared with the preceding 4 years, which is in line with our findings (Wong and Cheung 2020). In London, there was a substantial decrease in the number of trauma referrals, admissions, and operations, suggesting that governmental recommendations actually had an effect on activity in the population (Park et al. 2020, Sugand et al. 2020). We found no reduction in the number of ankle fractures among men, and this is difficult to explain. Maybe women adhered to the recommendations from the authorities to a greater extent. An increase in number of fractures was seen during the second 30-day period for the age group 30–49 years. A similar rebound but for hip fractures was shown in Finland by Ponkilainen et al. (2020). A strength of our study is the fact that the time period of the Covid-19 pandemic in spring 2020 was compared with a mean for the same time period over the past 3 years. Most other studies have compared 2020 with 1 other year, which might lead to an overinterpretation of the differences seen for single years. Another strength is that it includes 7 different orthopedic departments in Sweden, while most other studies are single center (Haskel et al. 2020, Lubbe et al. 2020, Park et al. 2020). A possible limitation of our study is that, even though the included departments have a history of rapid fracture registration entry in the SFR, the timeframe between the studied time period and data extraction from the SFR is short and, as a result, some fractures may not have been entered before data extraction. In conclusion, we observed a decline in the number of ankle fractures during the 1st wave of the Covid-19 pandemic and the subsequent social distancing and reduction of activities in society. The reduction in the number of ankle fractures was greatest during the 1st 30-day period, among women and people aged > 70 years. This may indicate a greater adherence to government recommendations in these sub-groups and during the first period of the pandemic. Ankle fracture incidence may reflect the extent of lockdowns.

EMR planned and conducted the study and the statistical calculations, analyzed data, and wrote the first draft of the manuscript. JE conducted the statistical calculations and revised the manuscript. MM, OW, and DW planned and conducted the study and revised the manuscript.

Acta Orthopaedica 2021; 92 (4): 381–384

The authors would like to thank all the orthopedic surgeons at the affiliated departments for entering detailed data in the Swedish Fracture Register. Acta thanks Ville Mattila and Kapil Sugand for help with peer review of this study. Bergh C, Wennergren D, Moller M, Brisby H. Fracture incidence in adults in relation to age and gender: a study of 27,169 fractures in the Swedish Fracture Register in a well-defined catchment area. PLoS One 2020; 15(12): e0244291. doi: 10.1371/journal.pone.0244291. Government Offices of Sweden. Strategy in response to the COVID-19 pandemic. https://www.government.se/articles/2020/04/strategy-in-responseto-the-covid-19-pandemic/; 2020. Haskel J D, Lin C C, Kaplan D J, Dankert J F, Merkow D, Crespo A, Behery O, Ganta A, Konda S R. Hip fracture volume does not change at a New York City Level 1 Trauma Center during a period of social distancing. Geriatr Orthop Surg Rehabil 2020; 11. doi: 10.1177/2151459320972674. Juto H, Moller M, Wennergren D, Edin K, Apelqvist I, Morberg P. Substantial accuracy of fracture classification in the Swedish Fracture Register: Evaluation of AO/OTA-classification in 152 ankle fractures. Injury 2016; 47(11): 2579-83. doi: 10.1016/j.injury.2016.05.028. Kuitunen I, Ponkilainen V T, Launonen A P, Reito A, Hevonkorpi T P, Paloneva J, Mattila V M. The effect of national lockdown due to COVID19 on emergency department visits. Scand J Trauma Resusc Emerg Med 2020; 28(1): 114. doi: 10.1186/s13049-020-00810-0. Lubbe R J, Miller J, Roehr C A, Allenback G, Nelson K E, Bear J, Kubiak E N. Effect of statewide social distancing and stay-at-home directives on orthopaedic trauma at a Southwestern Level 1 Trauma Center during the COVID-19 pandemic. J Orthop Trauma 2020; 34(9): e343-e8. doi: 10.1097/BOT.0000000000001890. Park C, Sugand K, Nathwani D, Bhattacharya R, Sarraf K M. Impact of the COVID-19 pandemic on orthopedic trauma workload in a London level 1 trauma center: the “golden month”. Acta Orthop 2020; 91(5): 556-61. doi: 10.1080/17453674.2020.1783621. Ponkilainen V, Kuitunen I, Hevonkorpi TP, Paloneva J, Reito A, Launonen A P, Mattila V M. The effect of nationwide lockdown and societal restrictions due to COVID-19 on emergency and urgent surgeries. Br J Surg 2020; 107(10): e405-e6. doi: 10.1002/bjs.11847. Public Health Agency of Sweden. COVID-19. https://wwwfolkhalsomyndighetense/the-public-health-agency-of-sweden/communicable-diseasecontrol/covid-19/; 2020. Rydberg E M, Zorko T, Sundfeldt M, Moller M, Wennergren D. Classification and treatment of lateral malleolar fractures: a single-center analysis of 439 ankle fractures using the Swedish Fracture Register. BMC Musculoskelet Disord 2020; 21(1): 521. doi: 10.1186/s12891-020-03542-5. Sugand K, Park C, Morgan C, Dyke R, Aframian A, Hulme A, Evans S, Sarraf K M, Baker C, Bennett-Brown K, Simon H, Bray E, Li L, Lee N, Pakroo N, Rahman K, Harrison A. Impact of the COVID-19 pandemic on paediatric orthopaedic trauma workload in central London: a multi-centre longitudinal observational study over the “golden weeks”. Acta Orthop 2020; 91(6): 633-8. doi: 10.1080/17453674.2020.1807092. Swedish Fracture Register. Annual Report 2017; 2018. Swedish Fracture Register. Annual Report 2018; 2019. Swedish Fracture Register. Annual report 2019; 2020. doi: 10.18158BkOFY5vhL. Wennergren D, Ekholm C, Sandelin A, Moller M. The Swedish fracture register: 103,000 fractures registered. BMC Musculoskelet Disord 2015; 16: 338. doi: 10.1186/s12891-015-0795-8. Wennergren D, Stjernstrom S, Moller M, Sundfeldt M, Ekholm C. Validity of humerus fracture classification in the Swedish fracture register. BMC Musculoskelet Disord 2017; 18(1): 251. doi: 10.1186/s12891-017-1612-3. Wong J S H, Cheung K M C. Impact of COVID-19 on orthopaedic and trauma service: an epidemiological study. J Bone Joint Surg Am 2020; 102(14): e80. doi: 10.2106/JBJS.20.00775.

Acta Orthopaedica 2021; 92 (4): 385–393

385

Availability and reporting quality of external validations of machinelearning prediction models with orthopedic surgical outcomes: a systematic review Olivier Q GROOT 1,2, Bas J J BINDELS 2, Paul T OGINK 2, Neal D KAPOOR 1, Peter K TWINING 1, Austin K COLLINS 1, Michiel E R BONGERS 1, Amanda LANS 1,2, Jacobien H F OOSTERHOFF 1, Aditya V KARHADE 1, Jorrit-Jan VERLAAN 2, and Joseph H SCHWAB 1 1 Orthopedic

Oncology Service, Massachusetts General Hospital, Harvard Medical School, Boston, USA; 2 Department of Orthopedic Surgery, University Medical Center Utrecht, Utrecht University, The Netherlands Correspondence: oqgroot@gmail.com Submitted 2021-01-21. Accepted 2021-03-10.

Background and purpose — External validation of machine learning (ML) prediction models is an essential step before clinical application. We assessed the proportion, performance, and transparent reporting of externally validated ML prediction models in orthopedic surgery, using the Transparent Reporting for Individual Prognosis or Diagnosis (TRIPOD) guidelines. Material and methods — We performed a systematic search using synonyms for every orthopedic specialty, ML, and external validation. The proportion was determined by using 59 ML prediction models with only internal validation in orthopedic surgical outcome published up until June 18, 2020, previously identified by our group. Model performance was evaluated using discrimination, calibration, and decision-curve analysis. The TRIPOD guidelines assessed transparent reporting. Results — We included 18 studies externally validating 10 different ML prediction models of the 59 available ML models after screening 4,682 studies. All external validations identified in this review retained good discrimination. Other key performance measures were provided in only 3 studies, rendering overall performance evaluation difficult. The overall median TRIPOD completeness was 61% (IQR 43–89), with 6 items being reported in less than 4/18 of the studies. Interpretation — Most current predictive ML models are not externally validated. The 18 available external validation studies were characterized by incomplete reporting of performance measures, limiting a transparent examination of model performance. Further prospective studies are needed to validate or refute the myriad of predictive ML models in orthopedics while adhering to existing guidelines. This ensures clinicians can take full advantage of validated and clinically implementable ML decision tools.

Multiple machine learning (ML) algorithms have recently been developed for prediction of outcomes in orthopedic surgery. A recent systematic review demonstrated that 59 models are currently available covering a wide variety of surgical outcomes, such as survival, postoperative complications, hospitalization, or discharge disposition to aid clinical decisionmaking (Ogink et al. 2021). However, it is imperative that these models are accurate, reliable, and applicable to patients outside the developmental dataset. Even though internal validation studies regularly report good performance, these results are often too optimistic as performance on external validation worsens due to initial overfitting (Collins et al. 2014, Siontis et al. 2015). External validation refers to assessing the model’s performance on a dataset that was not used during development. Testing the developed model on independent datasets addresses the aforementioned concerns of internal validation, including: the generalizability of the model in different patient populations, shortcomings in statistical modelling (e.g., incorrect handling of missing data), and model overfitting (Collins et al. 2014, 2015). Therefore, external validation is essential before a model can be used in routine clinical practice. Although a growing number of ML prediction models are being developed in orthopedics, no overview exists of the number of available ML prediction models that are externally validated, how they perform in an independent dataset, and what the transparency of reporting is of these external validation studies. Therefore, we assessed the proportion, performance, and transparent reporting of externally validated ML prediction models in orthopedic surgery, using the Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD) guidelines.

386

Records identified through expertise network of all authors n=3

Acta Orthopaedica 2021; 92 (4): 385–393

Records identified through PubMed, n = 724 Embase, n = 705 Cochrane, n = 43

Records identified through first and last author n = 6,042

Records after duplicates removed, screened on title and abstract n = 4,682 Excluded n = 4,627 Full-text articles assessed for eligibility n = 55 Excluded (n = 40): – internal validation, 25 – no surgical outcome, 15 – no prediction model, 3 – no ML, 4 – other exclusion criteria, 0 Included external validations n = 18

Figure 1. Flowchart of study selection.

Material and methods Systematic literature search Adhering to the 2009 PRISMA guidelines, this review was registered online at PROSPERO (Moher et al. 2016). A systematic search was conducted in PubMed, Embase and Cochrane up to November 17, 2020. 3 different domains of medical subject headings (MeSH) terms and keywords were combined with “AND”, and within domains the terms were combined with ‘OR’. The 3 domains included words related to orthopedics, ML, and external validation. In addition, we searched the first and last authors from the 59 ML prediction models previously identified in a systematic review by our study group combined with the domain “machine learning” (Appendix 1, see Supplementary data) (Ogink et al. 2021). 2 authors (NDK, PKT) independently screened all titles and abstracts. All references of the included studies were examined for relevant studies not identified by the initial search. The final list of included studies was sent to all coauthors, all of whom had worked with and/or published ML prediction models in orthopedics for a last check of potentially missed studies (Figure 1). Eligibility criteria Inclusion criteria were: external validation; prediction models based on ML; and orthopedic surgical outcome (defined as any outcome after musculoskeletal surgery). Exclusion criteria were: non-ML prediction model (e.g., standard logistic regression); internal validation (e.g., cross-validation and holdout test set from developmental dataset); lack of full text; conference abstracts; animal studies; and languages other than English, Spanish, German, or Dutch. We considered advanced logistic regression methods as ML algorithms such

as penalized LR (LASSO, ridge or elastic-net), boosted LR and bagged LR. Data extraction Data extracted from each study were: year of publication; 1st author; disease; type of surgery; prospective study design; level of care from which the dataset originates (e.g., tertiary); country; type of ML algorithm (e.g., Bayesian Belief Network); sample size; input features; predicted outcome; time points of outcome; performance measures according to the ABCD approach (Steyerberg and Vergouwe 2014) (A = calibration-in-the-large, or the model intercept; B = calibration slope; C = discrimination, with an area under the curve [AUC] using evaluation metrics of receiver operating characteristic [ROC] curves or precision-recall [PR] plots; D = decisioncurve analysis); mention of guideline adherence; TRIPOD items (Collins et al. 2015); and PROBAST domains (Wolff et al. 2019). Data were extracted from the largest cohort when multiple cohorts were present and the best performing model if a study reported results for multiple outcomes (e.g., 90-day and 1-year survival). Performance measures of the developmental study were extracted to compare with the results of external validation. 2 reviewers (OQG, BJJB) independently extracted all data and disagreements were discussed with a third reviewer present (PTO) until consensus was achieved. TRIPOD and PROBAST The TRIPOD guidelines were simultaneously published in 11 leading medical journals in January 2015 (Collins et al. 2015). Although various other guidelines exist (von Elm et al. 2007, Luo et al. 2016), we deemed the TRIPOD guidelines essential for transparent reporting requirements, which is imperative when judging the validity and applicability of a prediction model. Also, the TRIPOD guidelines were developed entirely for transparent reporting of prognosis or diagnosis prediction model studies (Figure 2 and 3, see Supplementary data). The PROBAST assesses the risk of bias of a study that validates a prognostic prediction model (Wolff et al. 2019). It is specifically designed to grade studies included in a systematic review. 4 domains are assessed for risk of bias: (1) participants; (2) predictors; (3) outcome; (4) and analysis (Figure 4, see Supplementary data). Statistics The proportion of externally validated ML prediction models in orthopedic surgical outcome was calculated by dividing 59 models by the externally validated models identified through this current study. Our group previously found 59 ML prediction models using only internal validation meeting the same criteria (except the criterium was “developmental” instead of “external validation”) in a systematic search dated up until June 18, 2020 (Groot et al. 2021, Ogink et al. 2021). Of the identified external validation studies, we determined how many unique models were externally validated, as 1 model

Acta Orthopaedica 2021; 92 (4): 385–393

387

Table 1. Characteristics of external validation studies on orthopedic surgical outcome prediction (n = 18) Number Adherence First author, ML Prospective Input of to a publication year Disease condition Operation model database Output predictors patients guideline Anderson, 2020 Pathological fractures nos BBN no Survival Bongers, 2019 Extracranial chondrosarcoma nos BPM no Survival Bongers, 2020a Extracranial chondrosarcoma nos BPM no Survival Bongers, 2020b Bone metastases (spine) nos SGB no Survival Forsberg, 2012 Bone metastases (extremities) nos BBN no Survival Forsberg, 2017 Bone metastases nos BBN no Survival Harris, 2019 nos Elective TJA LASSO no Survival, complications Huang, 2019 Non-metastatic chondrosarcoma nos LASSO no Survival Jo, 2020 nos TKA GBM no Transfusion Karhade, 2020 Bone metastases (spine) nos SGB no Survival Ko, 2020 nos TKA GBM no Acute kidney injury Meares, 2019 Bone metastases (femoral) nos BBN no Survival Ogura, 2017 Bone metastases nos BBN no Survival Overmann, 2020 Bone metastases (extremities) nos BBN no Survival Piccioli, 2015 Bone metastases nos BBN no Survival Ramkumar, 2019a Osteoarthritis THA ANN no LOS; discharge disposition Ramkumar, 2019b Osteoarthritis TKA ANN no LOS; discharge disposition Stopa, 2019 Lumbar disc disorder Decompression NN no Nonhome or fusion discharge

Clinical 197 Clinical 179 Clinical 464 Clinical 200 Clinical 815 Clinical 815 Clinical 70,569

TRIPOD none TRIPOD TRIPOD none TRIPOD none

Clinical, 72 surgical Clinical, 400 surgical Clinical 176 Clinical, 455 surgical Clinical 114 Clinical 261 Clinical 815 Clinical 287 Clinical 2,771

none

none none none none none

Clinical

none

Clinical, surgical

4,017 144

none TRIPOD none

TRIPOD

ML = machine learning; nos = not otherwise specified; TJA = total joint arthroplasty; TKA = total knee arthroplasty; THA = total hip arthroplasty; BBN = Bayesian Belief Network; NN = neural network; BPM = Bayes Point Machine; SGB = Stochastic Gradient Boosting; LASSO = least absolute shrinkage and selection operator; GBM = gradient boosting machine; LOS = length of stay; TRIPOD = Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis.

can be externally validated multiple times with different datasets. 1 incremental value study was found, which also reported on external validation. Only the external validation part was assessed. Performance measures were extracted and expressed as they were originally reported (Steyerberg and Vergouwe 2014). No meta-analysis could be performed because of obvious heterogeneity between studies. Adherence to the TRIPOD guidelines and PROBAST domains was expressed in percentages and visualized by graphs. We used Microsoft Excel Version 19.11 (Microsoft Corp, Redmond, WA, USA) to extract data using standardized forms, and to create all figures and tables, and Mendeley Desktop Version 1.19.4 (Mendeley, London, UK) as reference software. Ethics, funding, and potential conflicts of interest As there was no contact with patients and no study interventions were performed, permission from our institutional review board was not required. The study was supported by a grant from the Foundation “De Drie Lichten” in The Netherlands (€7.195). The authors reported no further funding disclosures or conflicts of interest.

Results Study characteristics 4,682 unique studies were identified of which 15 remained after full-text screening. 3 studies missed by the search were added by the coauthor’s expertise network (Forsberg et al. 2012, 2017, Piccioli et al. 2015, Ogura et al. 2017, Bongers et al. 2019, Harris et al. 2019, Huang et al. 2019, Jo et al. 2019, Meares et al. 2019, Ramkumar et al. 2019a, 2019b, Stopa et al. 2019, Anderson et al. 2020, Bongers et al. 2020a, 2020b, Ko et al. 2020, Karhade et al. 2020, Overmann et al. 2020). None of the external validations used a prospective cohort and 12/18 investigated survival in bone oncology (Table 1). 6/18 mentioned adherence to the TRIPOD guidelines, but none included the actual checklist. All studies were affiliated with 6 institutions of which 7/18 with PATHFx and 5/18 with SORG (Figure 5, see Supplementary data). 17/18 had at least 1 author who was also an author on the paper that developed the model being evaluated. 9/18 of the studies reported on both development and external validation in the same paper; the other 9 only reported on external validation. All of the ML prediction models were freely available at www.pathfx.org, www.sorg-ai.com, safetka.net/, http://med.stanford.edu/s-

388

Acta Orthopaedica 2021; 92 (4): 385–393

Table 2. Characteristics of hospital setting and years of enrollment from external validation and corresponding developmental studies Authors’ development and validation First author, Model or institution the same publication Country Tertiary Hospitals Registry

Years of year enrollment

Cleveland Validation yes Ramkumar, 2019a USA mixed 11 no 2016–2018 Development Same USA mixed multiple NIS 2009–2011 Cleveland Validation yes Ramkumar, 2019b USA mixed 11 no 2016–2018 Development Same USA mixed multiple NIS 2009–2013 BETS/ PATHFx 1.0 Validation yes Forsberg, 2012 Scandinavia yes 8 no 1999–2009 Development Forsberg, 2011 USA yes 1 no 1999–2003 PATHFx 1.0 Validation yes Piccioli, 2015 Italy yes 13 no 2010–2013 Development Forsberg, 2011 USA yes 1 no 1999–2003 PATHFx 1.0 Validation yes Forsberg, 2017 Scandinavia yes 8 no 1999–2009 Development Same USA yes 1 no 1999–2003 PATHFx 1.0 Validation yes Ogura, 2017 Japan yes 5 no 2009–2015 Validation no Meares, 2019 Australia unknown 1 no 2003–2014 Development Forsberg, 2011/2017 USA yes 1 no 1999–2003 PATHFx 2.0 Validation yes Overmann, 2020 Scandinavia yes 8 no 1999–2009 Development Same USA yes 1 no 1999-2003 a PATHFx 3.0 Validation yes Anderson, 2020 Multinational yes multiple IBMR 2016–2018 Development Same USA yes 1 no 1999–2003, 2015–2018 SafeTKA Validation yes Jo, 2020 unknown unknown 1 no unknown Development Same South-Korea yes 1 no 2012–2018 SafeTKA Validation yes Ko, 2020 South-Korea yes 1 no 2018–2019 Development Same South-Korea yes 2 no 2012–2019 SORG Validation yes Bongers, 2019 USA yes 2 no 1992–2013 Development Thio, 2018 USA mixed multiple SEER 2000–2010 SORG Validation yes Bongers, 2020a Italy yes 1 no 2000–2014 Validation yes Karhade, 2020 USA yes 1 no 2003–2016 Development Karhade, 2020 USA yes 2 no 2000–2016 SORG Validation yes Bongers, 2020b USA yes 1 no 2014–2016 Validation yes Stopa, 2019 USA yes 1 no 2013–2015 Development Karhade, 2018 USA mixed multiple NSQIP 2011–2016 Stanford Validation yes Harris, 2019 USA mixed multiple VASQIP 2005–2013 Development Same USA mixed multiple NSQIP 2013–2014 Zhengzhou Validation yes Huang, 2019 China yes 1 no 2011–2016 Development Same USA mixed multiple SEER 2005–2014 BETS = Bayesian Estimated Tools for Survival; SORG = Spinal Oncology Research Group; NSQIP = National Surgical Quality Improvement Program; SEER = Surveillance, Epidemiology, and End Results; IBMR = International Bone Metastasis Registry; NIS = National Inpatient Sample; VASQIP = Veterans Affairs Surgical Quality Improvement Program. a This study also included an external validation on a second registry cohort of 192 patients from the Military Health System Data Repository.

spire/Resources/clinical-tools-.html, and https://github.com/ JaretK/NeuralNetArthroplasty. 17 datasets were used because 3 studies used 1 Scandinavian dataset and 1 study included 2 validation registry cohorts (Table 2). 14/17 of the datasets originated from hospitals, the other 3 were from a registry. The median sample size of the external validation datasets was 274 patients (IQR, 178–552) and 7/17 were American datasets (Figure 6). Proportion This systematic review identified 18 external validation studies of ML models predicting outcomes in orthopedic surgery. In these 18 external validation studies, 10 unique ML prediction models were validated as 2 models were validated twice, and 1 model 7 times as it was validated and updated multiple times with distinct datasets. Therefore, 10/59 of the

ML models predicting outcomes in orthopedic surgery published until June 18, 2020 were externally validated. Of the 10 models, 3 were externally validated with patients from another country than the developmental cohort, including 1 model by 4 different countries. Performance All studies reported the ROC AUC, which retained good discriminative ability with a value greater than 0.70 and/or less than 0.10 decreased performance compared with the corresponding development study (Table 3 and Figure 7, see Supplementary data). No PR AUC evaluation metrics were provided, despite 3/18 of the datasets consisting of imbalanced class distribution in which the ratio events:non-events was greater than 1:10. Calibration intercept and slope, or curve, were reported in 7/18. 5/18 reported calibration slope or

Acta Orthopaedica 2021; 92 (4): 385–393

389

Figure 6. Distribution of development and external validation studies. All of the developmental studies that were externally validated except 2 South Korean ones were built on American datasets, unlike the origin of the external validation studies. Symbols without a number correspond with 1 study. Studies that included both development and external validation within the same study were counted twice in the figure according to where both datasets originated from. Adherence (%) 100 Low 80

Participants

Predictors

High

Unclear

Outcome

Analysis

Overall risk of bias 0

2 3a 3b 4a 4b 5a 5b 5c 6a 6b 7a 7b 8

Title and Intro abstract

Methods

9 10c 10d 12 13a 13b 13c 16 18 19a 19b 20 21 22

Results

Discussison Other

Figure 8. Overall adherence to each TRIPOD item (n = 18).

curves that showed overall underfitting of the data. Decision curve analyses were provided in 9/18, all of which illustrated that the prediction models were suitable for clinical use. TRIPOD and PROBAST The overall median completeness of the TRIPOD items was 61% (IQR 43–90%; Figure 8 and Table 4, see Supplementary data). All method items adhered to a median completeness of 56% (IQR 44–72%) and all results items to a median of 42% (IQR 22–61%). 6 items were reported in more than 16 studies including 3 discussion items (Table 5). 6 items were reported in less than 4 studies, including details of abstract, participant selection, and reporting key performance measures. Participant selection (domain 1) was considered an unclear risk of bias in 10 studies because no information was provided on the inclusion and exclusion of patients (Figure 9). Predictors (domain 2) were deemed a low risk of bias in 16 studies, as 2 studies were unclear in their predictor’s definitions and assessment. Outcome (domain 3) was rated a high risk of bias in 2 studies as they did not determine survival in a similar way

100

Distribution of bias risks (%)

TRIPOD items

Figure 9. PROBAST results for all 4 domains and overall judgement (n = 18).

for all patients by assigning “death” to all patients lost to follow-up. 2 additional studies in the outcome domain were rated an unclear risk of bias because it was difficult to discern if they used the same postoperative complication definitions for both the development and external validation study. Analysis (domain 4) was rated a high risk of bias in 17 studies, mainly due to small sample sizes with less than 100 events in the outcome group or no calibration metrics. The overall judgement of risk of bias for the 18 studies was high in 17 studies and low in 1 study, as only 1 study scored “low risk of bias” across all 4 domains.

Discussion The focus on developing and publishing ML prediction models has led to an increasing body of studies. Yet, it is of equal importance to externally validate these models, as the TRIPOD states in its guidelines: “external validation is an invaluable and crucial step in the introduction of a new predic-

390

Acta Orthopaedica 2021; 92 (4): 385–393

Table 5. Sorted by completeness of above 90% reporting and under 25% of individual TRIPOD items TRIPOD item

TRIPOD description % (n)

Complete reporting > 90% 3a Explain the medical context (including whether diagnostic or prognostic) and rationale for validating the multivariable prediction model, including references to existing models 100 (18) 4a Describe the study design or source of data (e.g., randomized trial, cohort, or registry data), separately for the validation data set 100 (18) 19b Give an overall interpretation of the results considering objectives, limitations, results from similar studies and other relevant evidence 100 (18) 22 Give the source of funding and the role of the funders for the present study 100 (18) 6b Report any actions to blind assessment of the outcome to be predicted 94 (17) 19a Discuss the results with reference to performance in the development data, and any other validation data 94 (17) Complete reporting < 25% 2 Provide a summary of objectives, study design, setting, participants, sample size, predictors, outcome, statistical analysis, results, and conclusions 0 (0) 7b Report any actions to blind assessment of predictors for the outcome and other predictors 0 (0) 7a Clearly define all predictors used in validating the multivariable prediction model, including how and when they were measured 11 (2) 5c Give details of treatments received, if relevant 22 (4) 13a Describe the flow of participants through the study, including the number of participants with and without the outcome and, if applicable, a summary of the follow-up time. A diagram may be helpful 22 (4) 16 Report performance measures (with confidence intervals) for the prediction model (results) 22 (4)

from distinct geographic sites is needed, as the generalizability of models to other countries may be affected by differences in healthcare systems, predictor measurements, and treatment strategies (Steyerberg et al. 2013). Although the recent surge of ML models in orthopedics is exciting, it is critical that these models are tested with external, real-world, operational data in different geographical settings before the orthopedic community can fully embrace the models in clinical practice.

tion model before it should be considered for routine clinical practice.” Although the external validation studies identified in this review retained good discriminatory performance and overall adhered well to the TRIPOD guidelines, only 10/50 of the ML models predicting orthopedic surgical outcome published up to June 2020 have been externally validated. Skepticism of these non-externally validated models is necessary and an increased effort in externally validating existing models is required to realize the full potential of ML prediction models.

Performance The external validations identified in this review retained good discrimination. Other key characteristics recommended evaluating a model’s performance such as calibration, and whether decision-curve analysis was inadequately or not reported, as observed here and in similar reviews (Collins et al. 2011, 2014, Bouwmeester et al. 2012, Tangri et al. 2013). Calibration measures were provided in only 7 of the 18 studies, preventing a transparent examination of model performance across the range of predicted probabilities (Steyerberg and Vergouwe 2014). Lastly, and arguably more important than the other metrics, is clinical usefulness evaluated by decision-curve analysis (Vickers and Elkin 2006). All 9 of the 18 studies that reported a decision-curve analysis indicated that the models were suitable for clinical use. Importantly, these curves do not estimate the likelihood of the outcome, but rather illustrate when the model should and should not be used in certain clinical situations over a range of thresholds. Overall, only 3 studies provided all 4 key measures to evaluate performance reliably, despite a substantial body of methodological literature and published guidance emphasizing the importance of these performance measures (von Elm et al. 2007, Steyerberg and Vergouwe 2014, Collins et al. 2015, Luo et al. 2016). Clinical researchers should use proposed frameworks such as Steyerberg’s ABCD approach to systematically report the performance of a validated model to allow accurate evaluation (Steyerberg and Vergouwe 2014). An additional interesting find is that 17 of the 18 studies were conducted by authors involved in the development of the model. Authors evaluating their own model might be overly optimistic, selectively report the results to their own advantage, and even defer publication if the performance is poor (Siontis et al. 2015). Although validating one’s model is an essential first step, ideally this should be done by researchers not affiliated with the developmental study.

Proportion A disappointingly low 10/59 of the current available ML prediction models were externally validated in orthopedic surgical outcome with none of the datasets being prospective. Prospectively testing the performance of ML models under real-world circumstances is an essential step towards integrating these models into the clinical setting and evaluating the impact on healthcare (Collins et al. 2015). In addition, increased effort towards external validation on patient data

TRIPOD and PROBAST Although the external validations fared better in overall TRIPOD adherence than their corresponding developmental studies, they too had numerous incomplete items. The abstract, for which complete reporting required information on 12 elements, was incomplete in all studies. Some basic key details such as defining predictor definitions, outcome, or treatment elements were poorly reported, despite not being specific to ML external validation studies. Specifying and reporting per-

TRIPOD = Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis.

Acta Orthopaedica 2021; 92 (4): 385–393

formance measures was poorly done in over half of the studies. Despite 6 TRIPOD items scoring less than 25% (5 were methods/results), 11 items scored over 75%, which included mainly introduction and discussion items. This difference in adherence across sections perhaps illustrates that the orthopedic community comprehends the rationale, promise, and limitations of ML prediction models, but proper knowledge of methodological standards to describe and evaluate external validations studies is lacking. Standardized reporting and adherence to peer-reviewed guidelines such as the TRIPOD guidelines will aid in the execution and reporting of external validation studies, resulting in validated ML prediction models that are reliable, accurate, and that add to surgical decision-making (Collins et al. 2015). The PROBAST domains identified 2 major concerns in addition to the TRIPOD items. First, little attention was given to the flow of patient selection, as none of the studies included a flow diagram of included and excluded patients. Possibly, studies purposely did not include flow diagrams or selection criteria to maintain the generalizability of the model to patients outside the selection criteria, but studies should explicitly state this. Second, the sample sizes were often too small, as only 5 of the 17 validation datasets had more than 100 events in each outcome group. Previous studies have shown that calibration results are less reliable with datasets with less than 100 outcome events (Vergouwe et al. 2005). In most circumstances, it would have been difficult to reach this number as the disease conditions were primarily bone oncology related. To address the issue of inadequate number of outcomes, multi-institutional collaboration is needed to achieve effective sample sizes to allow reliable external validations. Limitations 1st, studies meeting the selection criteria may have been missed. However, we believe this was unlikely as we used 4 different search strategies. In addition, we believe that any missed studies would not have had a profound impact on the review’s message as the percentage of externally validated models was well below 20%. 2nd, 5 of the 18 included studies originated from the authors’ institution (SORG) and the reviewers may have been biased assessing them. To account for this potential bias, the 2nd reviewer (BJJB) was not affiliated with the institution, the PI was not present during the consensus meetings, and an online PROSPERO protocol was registered. 3rd, publication bias may have occurred as successful external validations may be published more often. The performance results presented in this review may therefore be too optimistic and the number of studies externally validated too pessimistic. Studies demonstrating poorer performing models are part of the implementation process and ideally should be equally embraced by journals as high-performing models. In addition, the AUCs presented in 3 studies may have been too optimistic as they used ROC metrics on imbalanced datasets. Future studies should provide PR AUC metrics for datasets with an imbalanced class

391

distribution (Saito and Rehmsmeier 2015). 4th, the presented low percentage of ML prediction models externally validated may have been unfair, as 20 ML models have been developed and published in the last year and external validation studies are time consuming. However, excluding the studies published in the last year to correct for this delay still only yielded a disappointing 18/39 of ML prediction models that were externally validated. In addition, not all published ML models are for deployment, as we are still exploring the potentials of ML and therefore publications’ primary motivation may be exploring the space of ML. Instead of externally validating these models, online tests should be provided where users can assess themselves how the ML models behave in different settings and parameters. Unfortunately, over half of the ML development studies did not provide online calculators, algorithms, and/or open access (Ogink et al. 2021). Future ML studies should place more emphasis on providing easy-to-access means where outside users can themselves assess model performance and behavior. 5th, various reporting guidelines exist such as STROBE and JMIR Guidelines for Developing and Reporting Machine Learning Models in Biomedical Research (von Elm et al. 2007, Luo et al. 2016). However, we used the TRIPOD guidelines to assess the transparent reporting as this guideline was explicitly developed to cover the development and validation of prediction models for prognosis (Collins et al. 2015). To improve on these guidelines, the TRIPOD authors are currently developing a TRIPOD-AI version specifically for reporting of AI prediction models (Collins and Moons 2019). 6th, the guidelines are endorsed by 21 medical journals, of which only 1 is orthopedic (Journal of Orthopedic & Sports Physical Therapy). Since none of the studies were published in journals that officially endorsed the TRIPOD, it may be unfair to expect compliance with these guidelines. However, we believe that the TRIPOD guidelines present a high-quality benchmark for assessing transparent reporting, which is necessary for externally validating existing models and creating clinically implementable ML prediction models. Despite these limitations, our review provides valuable insights into the amount and transparent reporting of current ML external validations in orthopedic surgical outcome prediction. Conclusion Despite the evident importance of evaluating the performance of prediction models on unseen datasets, this is rarely done as institutions are protective of sharing their data and journals prefer publishing development studies. In addition, algorithms that perform poorly on external validation may be subject to publication bias. The handful of available external validation studies overall adhered well to the TRIPOD guidelines, but certain items that are essential for transparent reporting were inadequately reported or not reported at all, namely details of the abstract, participant selection, and key performance measures. Increased effort to externally validate existing models on large, prospective, geographically distinct datas-

392

ets is required to ensure accurate and reliable validated ML prediction models. It will be difficult to achieve these types of datasets without multi-institutional collaboration across different geographic regions. We encourage researchers and institutions, from both within and outside the orthopedic ML community, to collaborate. Supplementary data Figures 2–5 and 7, Tables 3 and 4 and Appendix with search syntaxes are available as supplementary data in the online version of this article, http://dx.doi.org/10.1080/17453674.2021. 1910448

All authors have contributed to the research design and/or interpretation of data, and the drafting and revising of the manuscript. All authors have read and approved the final submitted manuscript. Acta thanks Fabian van de Bunt and Max Gordon for help with peer review of this study.

Anderson A B, Wedin R, Fabbri N, Boland P, Healey J, Forsberg J A. External validation of PATHFx version 3.0 in patients treated surgically and nonsurgically for symptomatic skeletal metastases. Clin Orthop Relat Res 2020; 478(4): 808-18. Bongers M E R, Thio Q C B S, Karhade A V, Stor M L, Raskin K A, Lozano Calderon S A, DeLaney T F, Ferrone M L, Schwab J H. Does the SORG algorithm predict 5-year survival in patients with chondrosarcoma? An external validation. Clin Orthop Relat Res 2019; 477(10): 2296-303. Bongers M E R, Karhade A V, Setola E, Gambarotti M, Groot O Q, Erdoğan K E, Picci P, Donati D M, Schwab J H, Palmerini E. How does the skeletal oncology research group algorithm’s prediction of 5-year survival in patients with chondrosarcoma perform on international validation? Clin Orthop Relat Res 2020a; 478(10): 2300-8. Bongers M E R, Karhade A V, Villavieja J, Groot O Q, Bilsky M H, Laufer I, Schwab J H. Does the SORG algorithm generalize to a contemporary cohort of patients with spinal metastases on external validation? Spine J 2020b; 20(10): 1646-52. Bouwmeester W, Zuithoff N P A, Mallett S, Geerlings M I, Vergouwe Y, Steyerberg E W, Altman D G, Moons K G M. Reporting and methods in clinical prediction research: a systematic review. PLoS Med 2012; 9(5): 1-12. Collins G S, Moons K G M. Reporting of artificial intelligence prediction models. Lancet 2019; 393(10181): 1577-9. Collins G S, Mallett S, Omar O, Yu L-M. Developing risk prediction models for type 2 diabetes: a systematic review of methodology and reporting. BMC Med 2011; 9: 103. Collins G S, de Groot J A, Dutton S, Omar O, Shanyinde M, Tajar A, Voysey M, Wharton R, Yu L-M, Moons K G, Altman D G. External validation of multivariable prediction models: a systematic review of methodological conduct and reporting. BMC Med Res Methodol 2014; 14: 40. Collins G S, Reitsma J B, Altman D G, Moons K G M. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD Statement. BMC Med 2015; 13: 1. Forsberg J A, Eberhardt J, Boland P J, Wedin R, Healey J H. Estimating survival in patients with operable skeletal metastases: an application of a bayesian belief network. PLoS One 2011; 6(5): e19956. Forsberg J A, Wedin R, Bauer H C F, Hansen B H, Laitinen M, Trovik C S, Keller J Ø, Boland P J, Healey J H. External validation of the Bayesian Estimated Tools for Survival (BETS) models in patients with surgically treated skeletal metastases. BMC Cancer 2012; 12 :493.

Acta Orthopaedica 2021; 92 (4): 385–393

Forsberg J A, Wedin R, Boland P J, Healey J H. Can we estimate short- and intermediate-term survival in patients undergoing surgery for metastatic bone disease? Clin Orthop Relat Res 2017; 475(4): 1252–61. Groot O Q, Ogink P T, Lans A, Twining P K, Kapoor N D, DiGiovanni W, Bindels B J, Bongers M E, Oosterhoff J H, Karhade A V, Verlaan J-J, Schwab J H. Poor reporting of methods and performance measures by machine learning studies in orthopaedic surgery: a systematic review. J Orthop Res 2021. doi: 10.1002/jor.25036. Online ahead of print. Harris A H S, Kuo A C, Weng Y, Trickey A W, Bowe T, Giori N J. Can machine learning methods produce accurate and easy-to-use prediction models of 30-day complications and mortality after knee or hip arthroplasty? Clin Orthop Relat Res 2019; 477(2): 452–60. Huang R, Sun Z, Zheng H, Yan P, Hu P, Yin H, Zhang J, Meng T, Huang Z. Identifying the prognosis factors and predicting the survival probability in patients with non-metastatic chondrosarcoma from the SEER Database. Orthop Surg 2019; 11(5): 801-10. Jo C, Ko S, Shin W C, Han H-S, Lee M C, Ko T, Ro D H. Transfusion after total knee arthroplasty can be predicted using the machine learning algorithm. Knee Surg Sports Traumatol Arthrosc 2020; 28(6): 1757-64. Karhade A V, Ogink P, Thio Q, Broekman M, Cha T, Gormley W B, Hershman S, Peul W C, Bono C M, Schwab J H. Development of machine learning algorithms for prediction of discharge disposition after elective inpatient surgery for lumbar degenerative disc disorders. Neurosurg Focus 2018; 45(5): E6. Karhade A V, Ahmed A K, Pennington Z, Chara A, Schilling A, Thio Q C B S, Ogink P T, Sciubba D M, Schwab J H. External validation of the SORG 90-day and 1-year machine learning algorithms for survival in spinal metastatic disease. Spine J 2020; 20(1): 14–21 Ko S, Jo C, Chang C B, Lee Y S, Moon Y-W, Youm J W, Han H-S, Lee M C, Lee H, Ro D H. A web-based machine-learning algorithm predicting postoperative acute kidney injury after total knee arthroplasty. Knee Surg Sports Traumatol Arthrosc 2020; doi: 10.1007/s00167-020-06258-0. Online ahead of print. Luo W, Phung D, Tran T, Gupta S, Rana S, Karmakar C, Shilton A, Yearwood J, Dimitrova N, Ho T B, Venkatesh S, Berk M. Guidelines for developing and reporting machine learning predictive models in biomedical research: a multidisciplinary view. J Med Internet Res 2016; 18(12): e323. Meares C, Badran A, Dewar D. Prediction of survival after surgical management of femoral metastatic bone disease - A comparison of prognostic models. J Bone Oncol 2019; 15: 100225. Moher D, Shamseer L, Clarke M, Ghersi D, Liberati A, Petticrew M, Shekelle P, Stewart L A, Estarli M, Barrera E S A, Martínez-Rodríguez R, Baladia E, Agüero S D, Camacho S, Buhring K, Herrero-López A, Gil-González D M, Altman D G, Booth A, Chan A W, Chang S, Clifford T, Dickersin K, Egger M, Gøtzsche P C, Grimshaw J M, Groves T, Helfand M, Higgins J, Lasserson T, Lau J, Lohr K, McGowan J, Mulrow C, Norton M, Page M, Sampson M, Schünemann H, Simera I, Summerskill W, Tetzlaff J, Trikalinos T A, Tovey D, Turner L, Whitlock E. Preferred reporting items for systematic review and meta-analysis protocols (PRISMA-P) 2015 statement. Rev Esp Nutr Humana y Diet. Asociacion Espanola de Dietistas-Nutricionistas 2016; 20(2): 148-60. Ogink P T, Groot O Q, Karhade A V, Bongers M E, Oner F C, Verlaan J-J, Schwab J H. Wide range of applications for machine learning prediction models in orthopedic surgical outcome: a systematic review. Accepted Acta Orthop; 2021. Ogura K, Gokita T, Shinoda Y, Kawano H, Takagi T, Ae K, Kawai A, Wedin R, Forsberg J A. Can a multivariate model for survival estimation in skeletal metastases (PATHFx) be externally validated using Japanese patients? Clin Orthop Relat Res 2017; 475(9): 2263-70. Overmann A L, Clark D M, Tsagkozis P, Wedin R, Forsberg J A. Validation of PATHFx 2.0: An open-source tool for estimating survival in patients undergoing pathologic fracture fixation. J Orthop Res 2020; 38(10): 214956. Piccioli A, Spinelli M S, Forsberg J A, Wedin R, Healey J H, Ippolito V, Daolio PA, Ruggieri P, Maccauro G, Gasbarrini A, Biagini R, Piana R, Fazioli F, Luzzati A, Di Martino A, Nicolosi F, Camnasio F, Rosa M A, Campanacci DA, Denaro V, Capanna R. How do we estimate survival?

Acta Orthopaedica 2021; 92 (4): 385–393

External validation of a tool for survival estimation in patients with metastatic bone disease-decision analysis and comparison of three international patient populations. BMC Cancer 2015; 15: 424. Ramkumar P N, Karnuta J M, Navarro S M, Haeberle H S, Iorio R, Mont M A, Patterson B M, Krebs V E. Preoperative prediction of value metrics and a patient-specific payment model for primary total hip arthroplasty: Development and validation of a deep learning model. J Arthroplasty 2019a; 34(10): 2228-2234.e1. Ramkumar P N, Karnuta J M, Navarro S M, Haeberle H S, Scuderi G R, Mont M A, Krebs V E, Patterson B M. Deep learning preoperatively predicts value metrics for primary total knee arthroplasty: Development and validation of an artificial neural network model. J Arthroplasty 2019b; 34(10): 2220-2227.e1. Saito T, Rehmsmeier M. The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets. PLoS One 2015; 10(3): e0118432. Siontis G C M, Tzoulaki I, Castaldi P J, Ioannidis J P A. External validation of new risk prediction models is infrequent and reveals worse prognostic discrimination. J Clin Epidemiol 2015; 68(1): 25-34. Steyerberg E W, Vergouwe Y. Towards better clinical prediction models: seven steps for development and an ABCD for validation. Eur Heart J 2014; 35(29): 1925-31. Steyerberg E W, Moons K G M, van der Windt D A, Hayden J A, Perel P, Schroter S, Riley R D, Hemingway H, Altman D G. Prognosis Research Strategy (PROGRESS) 3: prognostic model research. PLoS Med 2013; 10(2): e1001381. Stopa B M, Robertson F C, Karhade A V, Chua M, Broekman M L D, Schwab J H, Smith T R, Gormley W B. Predicting nonroutine discharge

393

after elective spine surgery: external validation of machine learning algorithms. J Neurosurg Spine 2019; 1-6. doi: 10.3171/2019.5.SPINE1987. Online ahead of print. Tangri N, Kitsios G D, Inker L A, Griffith J, Naimark D M, Walker S, Rigatto C, Uhlig K, Kent D M, Levey A S. Risk prediction models for patients with chronic kidney disease: a systematic review. Ann Intern Med 2013; 158(8): 596-603. Thio Q C B S, Karhade A V, Ogink P T, Raskin K A, De Amorim Bernstein K, Lozano Calderon S A, Schwab J H. Can machine-learning techniques be used for 5-year survival prediction of patients with chondrosarcoma? Clin Orthop Relat Res 2018; 476(10): 2040-8. Vergouwe Y, Steyerberg E W, Eijkemans M J C, Habbema J D F. Substantial effective sample sizes were required for external validation studies of predictive logistic regression models. J Clin Epidemiol 2005; 58(5): 475-83. Vickers A J, Elkin E B. Decision curve analysis: a novel method for evaluating prediction models. Med Decis Making 2006; 26(6): 565-74. von Elm E, Altman D G, Egger M, Pocock S J, Gøtzsche P C, Vandenbroucke J P. Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: guidelines for reporting observational studies. BMJ 2007; 335(7624): 806-8. Wolff R F, Moons K G M, Riley RD, Whiting P F, Westwood M, Collins G S, Reitsma J B, Kleijnen J, Mallett S. PROBAST: a tool to assess the risk of bias and applicability of prediction model studies. Ann Intern Med 2019; 170(1): 51-8. Forsberg J A, Eberhardt J, Boland P J, Wedin R, Healey J H. Estimating survival in patients with operable skeletal metastases: an application of a bayesian belief network. PLoS One 2011; 6(5): e19956.

394

Acta Orthopaedica 2021; 92 (4): 394–400

Deep neural networks with promising diagnostic accuracy for the classification of atypical femoral fractures Georg ZDOLSEK 1, Yupei CHEN 2, Hans-Peter BÖGL 1,3, Chunliang WANG 1, Mischa WOISETSCHLÄGER 4,5, and Jörg SCHILCHER 1,6 1 Department of Orthopedics and Department of Biomedical and Clinical Sciences, Faculty of Health Science, Linköping University, Linköping; 2 Department of Biomedical Engineering and Health Systems, Royal Institute of Technology, Stockholm; 3 Department of Orthopedic Surgery, Gävle Hospital; 4 Department of Radiology and Department of Medical and Health Sciences, Linköping; 5 Center for Medical Image Science and Visualization, Linköping University, Linköping; 6 Wallenberg Centre for Molecular Medicine, Linköping University, Linköping, Sweden

Correspondence: jorg.schilcher@liu.se Submitted 2020-10-08. Accepted 2021-01-28.

Background and purpose — A correct diagnosis is essential for the appropriate treatment of patients with atypical femoral fractures (AFFs). The diagnostic accuracy of radiographs with standard radiology reports is very poor. We derived a diagnostic algorithm that uses deep neural networks to enable clinicians to discriminate AFFs from normal femur fractures (NFFs) on conventional radiographs. Patients and methods — We entered 433 radiographs from 149 patients with complete AFF and 549 radiographs from 224 patients with NFF into a convolutional neural network (CNN) that acts as a core classifier in an automated pathway and a manual intervention pathway (manual improvement of image orientation). We tested several deep neural network structures (i.e., VGG19, InceptionV3, and ResNet) to identify the network with the highest diagnostic accuracy for distinguishing AFF from NFF. We applied a transfer learning technique and used 5-fold cross-validation and class activation mapping to evaluate the diagnostic accuracy. Results — In the automated pathway, ResNet50 had the highest diagnostic accuracy, with a mean of 91% (SD 1.3), as compared with 83% (SD 1.6) for VGG19, and 89% (SD 2.5) for InceptionV3. The corresponding accuracy levels for the intervention pathway were 94% (SD 2.0), 92% (2.7), and 93% (3.7), respectively. With regards to sensitivity and specificity, ResNet outperformed the other networks with a mean AUC (area under the curve) value of 0.94 (SD 0.01) and surpassed the accuracy of clinical diagnostics. Interpretation — Artificial intelligence systems show excellent diagnostic accuracies for the rare fracture type of AFF in an experimental setting.

Atypical fractures occur at atypical locations in the femoral bone and show a strong association with bisphosphonate treatment (Odvina et al. 2005, 2010, Shane 2010, Shane et al. 2010, Schilcher et al. 2011, 2015, Starr et al. 2018). In contrast to the metaphyseal area, which is the site for the majority of all fragility fractures, the diaphyseal region is where atypical fractures occur. As is the case for any other insufficiencytype fracture of the diaphysis, atypical fractures show specific radiographic features, such as a transverse or short oblique fracture line in the lateral femoral cortex and focal cortical thickening (Schilcher et al. 2013, Shane et al. 2014). These features differ from those of normal femur fractures (NFFs), which show oblique fracture lines and no signs of focal cortical thickening (Shin et al. 2016b). Early and correct diagnosis of AFF is essential for appropriate management (Bogl et al. 2020a), which minimizes the risk of healing complications (Bogl et al. 2020b). In clinical routine practice, conventional radiographs are used to diagnose complete AFF. However, routine diagnostic accuracy is poor, and < 7% of AFF cases are correctly identified in this way (Harborne et al. 2016). Artificial intelligence (AI), deep learning through convolutional networks, has proven effective in the classification (Russakovsky et al. 2015) and segmentation (Ronneberger et al. 2015) of medical images in general, and for bone fractures in particular (Brett et al. 2009, Olczak et al. 2017, Chung et al. 2018, Kim and MacKinnon 2018, Lindsey et al. 2018, Adams et al. 2019, Urakawa et al. 2019, Kalmet et al. 2020). Given the very specific radiographic pattern of these fractures, AI appears to be a useful tool for finding the needle (AFF) in the haystack (NFF).

Acta Orthopaedica 2021; 92 (4): 394–400

Patients with subtrochanteric or diaphyseal femur fractures in Sweden 2008–2010 with ICD 10 codes S72.2 and 72.3 n = 5,342 Patients with subtrochanteric or diaphyseal femur fractures in Sweden 2008–2010 with ICD 10 codes S72.2 and 72.3, causal code W n = 1,124 Patients with atypical femoral fractures (AFF) n = 172

Patients with normal femoral fractures (NFF) n = 952

Patients with AFF and available radiographs n = 149

Patients with NFF and available radiographs n = 224

Number of available radiographs n = 433

Number of available radiographs n = 549

Figure 1. The study cohort recruitment process.

We evaluated the abilities of different deep neural networks to discriminate complete AFF from NFF on diagnostic plain radiographs in an experimental setting and we assessed the effect of limited user intervention on diagnostic accuracy.

Patients and methods Dataset and preprocessing procedures The original dataset of radiographs comprised patients with complete AFFs and NFFs derived from a cohort of 5,342 Swedish women and men aged ≥ 55 years who had suffered a fracture of the femoral shaft in the period 2008–2010 (Figure 1). We extracted and anonymized the radiographs of 151 subjects with AFF and 230 subjects with NFF owing to their accessibility from the research PACS at Linköping University Hospital and excluding all patients with signs of previous surgery, such as joint prosthesis or other orthopedic implants, and signs of pathological fractures (Schilcher et al. 2015). Fracture classification into AFF and NFF was based on repeated individual review in several previous studies with excellent interrater reliability (Schilcher et al. 2011, 2015, Bogl et al. 2020b). For each subject, several radiographs were available. During manual screening, images with extensive artefacts (e.g., splints or plasters) were excluded. The final dataset included 433 radiographs from 149 patients with AFF and 549 from 224 patients with NFF. For the purpose of image processing, the original images were converted from the Digital Imaging and Communication in Medicine (DICOM) format to the Joint Photographic Experts Group (JPEG lossless) format. To allow for transfer learning from ImageNet, grayscale images were converted to Red Green Blue (RGB) images with 3 channels (with identical duplication for each channel). Moreover, all the images were padded to a square size and downsampled to 256×256 pixels to reduce the amount of data and computing time for the whole image. Image intensity was normalized to

395

have a mean of 0 and standard deviation of 1. The images were augmented through random rotation (± 10°), shifting (< 10%), and zooming (< 10%) so as to increase the robustness of the trained model. The dataset used in this study will be shared in the “AIDA Dataset Register” (https://datasets.aida. medtech4health.se). Network architecture We identified several network structures that had passed benchmark thresholds for large-scale visual recognition through the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) (Russakovsky et al. 2015). For image classification, convolutional networks are the most widely applied deep neural networks. Convolutional networks build upon convolutional layers, involving the application of a set of machine-learned convolution filters to the input image so as to extract image features such as lines, edges, and corners. During image processing, the filter slides across the image and stacks along the channel’s dimension in the convolutional layer (Goodfellow et al. 2016). VGG VGG (Visual Geometry Group) is a classic convolutional neural network (CNN) that consists of stacked convolutional layers, pooling layers, and fully connected layers (Simonyan aand Zisserman 2014), connected sequentially. These layers are the building blocks found commonly in modern CNNs. In the ILSVRC 2014 challenge, VGG was ranked in 2nd place (after the Inception network) for the image classification task and in 1st place for the localization task. We used the 19-layer model due to its high performance. Inception network The Inception network (the first version was also referred to as GoogLeNet) won the ILSVRC 2014 contest with a top-5 error rate of 6.7% (Russakovsky et al. 2015). This is a level close to human perception. Inception consists of 22 convolutional layers with batch normalization. Due to the introduction of inception blocks, each consists of parallel connections of four convolutional pathways. The Inception network uses 20-times fewer parameters than VGG. The computational cost associated with the use of Inception is therefore much lower than that linked to VGG or AlexNet, making it accessible to mobile computing devices with limited computational resources. The Inception architecture has been refined in various ways. We used InceptionV3 (Szegedy et al. 2016), which is 42 layers deep, and the computational cost is only about 2.5-fold higher than that of GoogLeNet (Inception V1), while at the same time being more efficient than VGG. ResNet The residual neural network ResNet won the ILSVRC 2015 contest (He et al. 2016). The ResNet strategy is based on an attempt to solve the problem of degradation with increasing

396

Acta Orthopaedica 2021; 92 (4): 394–400

Automated The radiographs were directly fed into the network with size and intensity normalization steps, as described above. All 3 CNN architectures (VGG19, InceptionV3, and ResNet50) were tested for diagnostic accuracy in relation to the classification of the radiographs to either AFF or NFF. C A

Figure 2. The intervention pathway involves reposition (A), rotation (B), and cropping (C).

depth. The network applies a residual function in a residual network based on the hypothesis that optimizing a residual mapping function is easier to achieve than optimizing the original unreferenced mapping. The network converges faster and gains accuracy with increasing depth (He et al. 2016). In a hyper-parameter searching experiment, we applied ResNet with depths of 18, 36, 50, 101, and 152 layers; ResNet50 showed the best performance for the number of images available in our study. Therefore, only the results obtained from ResNet50 are reported here. Transfer learning Transfer learning is a technique that applies known image data to pre-train a deep neural network by reusing image features from other tasks to learn the classification task at hand and, thereby, increase the performance of the neural network. It allows training of the network through random initialization in a situation of limited data and resources. ImageNet, which is currently the largest publicly available dataset for object recognition, is widely applied in transfer learning (Deng et al. 2009). ImageNet is an open image database that consists of 10,000,000 images collected from the internet. These images depict more than 10,000 object categories and every image is labeled according to the category it belongs to. For pre-training, we use the subset from the ILSVRC 2012 image classification challenge. This subset of images contains 1,000 object categories and 1,200,000 images. Even though these images are different from the radiographic images in our study, it is generally believed that pre-training of the CNN networks on this complicated image classification task will allow the networks to learn a hierarchy of generalizable features. Despite the differences between natural images and radiographs, transfer learning from ImageNet can make medical image recognition tasks more effective (Shin et al. 2016a). The cross-modality imaging transferability makes transfer learning in CNN representation from ImageNet popular in various modalities for imaging recognition. Feeding pathway 2 diagnostic pipelines were constructed using the CNNs as the core classifier, in order to study the influence of user intervention on network performance.

Intervention The radiographs show the femur in different positions in terms of the rotation and position of the radiograph. An automated script (see Supplementary data) was created using the Keras (https://keras.io/api/preprocessing/image/) and OpenCV package (https://opencv.org/) to move the fracture towards the center of the image and to rotate the femur into the vertical position. In addition, all the images were cropped to a size of 256×256 pixels around the center of the fracture (Figure 2). Using this intervention, we attempted to increase the precision of the network in focusing on the features of the fracture. The intervention involves 2 computer mouse clicks per image from the user interface and visual screening to ensure the quality of these images. Evaluation methods We used cross-validation to obtain more accurate results with less bias in the machine learning studies. Theoretically, the dataset is split into K-folds, of which 1 fold is for validation and the other folds are for training. The training and validation processes were repeated several times using different folds each time. The final results were then averaged and the standard deviation (SD) was calculated. We also calculated the diagnostic accuracy, precision, sensitivity, and specificity of each network to discriminate between the 2 types of fractures and present area under the ROC (receiver operating characteristics) curves (AUC) for each network and the 2 feeding pathways. Neural networks typically do not provide insights into the processes within the network that lead to a specific result. We used class activation mapping (CAM) to visualize the features that the network is learning. CAM visualizes a discriminative image region that is used by the network to identify a certain type of fracture (Zhou et al. 2016). Experimental setup During the training process, the batch size was set to 5 and trained for 100 epochs for InceptionV3 and ResNet50. For VGG19, we trained for 200 epochs, since it converged at a slower rate. The learning rate was set to 10–5 for all 3 models as a result of fine-tuning. A stochastic gradient descent (SGD) optimizer was used for VGG19. The Root Mean Square Propagation (RMSprop) and Adaptive Moment Optimization (Adam) optimizers were used for InceptionV3 and ResNet50 (Kingma and Ba 2014). Ethics and funding Ethical approval for the study was obtained from the Swedish Ethical Review authority (Dnr. 2020-01108). JS has received

Acta Orthopaedica 2021; 92 (4): 394–400

397

Accuracy – VGG19

Accuracy – InceptionV3

Accuracy – ResNet50

1.0

0.8

0.6

0.4

0.2

A 0

75 100 125 150 175 200

B 0

Epoch

100

C 0

Epoch

100

Epoch

Figure 3. Accuracy plots for: (A) VGG19; (B) InceptionV3; and (C) ResNet50, as expressed for the automated method. True positive rate – VGG19, auto

True positive rate – InceptionV3, auto

True positive rate – ResNet50, auto

1.0

0.8

0.6

ROC curve of fold1 (AUC = 0.81)

0.4

0.2

ROC curve of fold3 (AUC = 0.83)

ROC curve of fold4 (AUC = 0.82)

ROC curve of fold4 (AUC = 0.77)

ROC curve of fold5 (AUC = 0.85)

ROC curve of fold5 (AUC = 0.80)

0.2

0.4

Mean ROC (AUC = 0.80 ± 0.02)

± 1 SD

0.6

0.8

1.0

0.2

0.4

False positive rate

0.6

ROC curve of fold2 (AUC = 0.89) ROC curve of fold3 (AUC = 0.89) ROC curve of fold4 (AUC = 0.88) ROC curve of fold5 (AUC = 0.86)

0.2

Chance

Mean ROC (AUC = 0.82 ± 0.03)

0.4

ROC curve of fold2 (AUC = 0.79)

ROC curve of fold3 (AUC = 0.79)

Chance

ROC curve of fold1 (AUC = 0.94)

ROC curve of fold1 (AUC = 0.82)

0.4

ROC curve of fold2 (AUC = 0.86)

Chance Mean ROC (AUC = 0.89 ± 0.03) ± 1 SD

0.8

1.0

0.2

0.4

0.6

0.8

1.0

False positive rate

True positive rate – VGG19, intervention

True positive rate – InceptionV3, intervention

True positive rate – ResNet50, intervention

1.0

0.8

0.6

ROC curve of fold1 (AUC = 0.88)

0.4

ROC curve of fold2 (AUC = 0.96)

0.2

ROC curve of fold1 (AUC = 0.88)

0.4

ROC curve of fold3 (AUC = 0.91)

ROC curve of fold4 (AUC = 0.89)

ROC curve of fold5 (AUC = 0.83)

Chance

0.2

0.4

0.6

ROC curve of fold4 (AUC = 0.94) ROC curve of fold5 (AUC = 0.96)

0.2

Chance

Mean ROC (AUC = 0.89 ± 0.04)

± 1 SD

ROC curve of fold2 (AUC = 0.93) ROC curve of fold3 (AUC = 0.95)

Chance

Mean ROC (AUC = 0.89 ± 0.04)

ROC curve of fold1 (AUC = 0.92)

0.4

ROC curve of fold2 (AUC = 0.96)

ROC curve of fold3 (AUC = 0.91)

Mean ROC (AUC = 0.94 ± 0.01)

± 1 SD

0.8

1.0

0.2

0.4

False positive rate

0.6

± 1 SD

0.8

1.0

False positive rate

0.2

0.4

0.6

0.8

1.0

False positive rate

Figure 4. Receiver operating characteristics (ROC) curves for each network and the automated (upper row) and the intervention (lower row) pathway. The intervention pathway of the ResNet50 network shows the lowest rate of false positives at the highest rate of true positives yielding a mean AUC (area under the curve) of 0.94.

institutional support or lecturer’s fees outside of this work from Link Sweden AB, Depuy Synthes, and Sectra.

Results Automated pathway When using standardized, pre-processed input data, the evaluated accuracies for the validation dataset were 91% (SD 1.3), 83% (SD 1.6), and 89% (SD 2.5) for ResNet50, VGG19, and InceptionV3. It took about 4 hours to train 1 fold with 100 epochs each on a personal computer with an Intel Core i7 8700 CPU, 16 GB RAM and an NVIDIA GTX 1070 graphic card.

According to the accuracy plots (Figure 3) and ROC curves (Figure 4, upper row), ResNet not only achieved the highest final diagnostic accuracy, mean AUC value 0.89 (SD 0.03), but it also reached high levels after fewer iterations compared with the competitor networks. The results of the 5-fold crossvalidation are shown in Table 1 and Figure 4 (upper row). Intervention pathway After adjustment of the image alignment and rotation, the diagnostic accuracies increased to 94% (SD 2), 92% (SD 2.7), and 93% (SD 3.7) for ResNet50, VGG19, and InceptionV3, respectively (Table 2). Similar improvements can be seen in the AUC values of the ROC curves (Figure 4, lower row),

398

Acta Orthopaedica 2021; 92 (4): 394–400

Table 1. Cross-validation of the automated method using VGG19, InceptionV3, and ResNet, expressed as percentages and averages with standard deviations (SD) for K-folds

Table 2. Cross-validation for intervention method (manual adjustment of alignment and rotation) using VGG19, InceptionV3, and ResNet expressed as percentages and averages (SD) for K-folds

Accuracy (%) K-fold VGG19 InceptionV3 ResNet

Fold1 81.6 85.3 89.1 Fold2 84.4 89.1 90.7 Fold3 84.4 91.7 90.0 Fold4 81.6 91.1 92.5 Fold5 81.4 90.0 90.0 Average 82.7 89.4 90.5 SD 1.6 2.5 1.3

Fold1 Fold2 Fold3 Fold4 Fold5 Average SD

Table 3. Comparison with Multi-Metrics depicting accuracy, sensitivity, specificity, and precision of discrimination between fracture types expressed as percentages (SD) for the automatedmethod and the intervention method for each network Network

Method

Accuracy Sensitivity Specificity Precision

VGG19 InceptionV3 ResNet50

Automated Interactive Automated Interactive Automated Interactive

82.7 (1.6) 92.2 (2.7) 89.4 (2.5) 93.4 (3.7) 90.5 (1.3) 94.4 (2.0)

85.4 (4.0) 93.0 (2.0) 90.6 (3.4) 92.8 (5.0) 89.0 (3.3) 94.4 (1.5)

79.6 (4.8) 91.6 (5.2) 88.4 (2.6) 94.2 (3.7) 92.2 (3.0) 95.8 (1.9)

81.0 (3.4) 91.6 (5.2) 88.4 (2.6) 94.8 (3.3) 92.2 (3.0) 95.8 (1.9)

90.4 97.0 94.1 94.9 94.3 94.3 91.4 91.7 94.2 95.1 88.0 97.0 89.0 96.2 95.2 92.2 93.4 94.4 2.7 3.7 2.0

specificity (92%, SD 3) and precision (92%; SD 3), whereas the InceptionV3 network showed a slightly higher sensitivity of 91% (SD 3.4), as compared with 89% (SD 3.3) when using the automated pathway. With sensitivity of 94% (SD 1.5), specificity of 96% (SD 1.9), and precision of 96% (SD 1.9), the ResNet50 outperformed all its competitor networks (Table 3). Visualization of the results The final stage was to visualize the features that networks are learning with class activation mapping. Thus, one can observe the discriminative image regions used by the neural networks to identify AFF or NFF. Some examples of the results and potential sources of error in these analyses are shown in Figure 5.

Discussion Our aim was to identify deep neural networks that can discriminate atypical femoral fractures from normal femoral shaft fractures on routine conventional radiographs. ResNet50 outperformed other networks with respect to both diagnostic accuracy and time required for the analysis (Table 3). Our findings are in line with those of previous studies indicating that ResNet outperforms other neural networks in classification tasks (He et al. Figure 5. Attention maps showing areas in the image that are utilized by the network for 2016, Brinker et al. 2018). ResNet takes advanlearning through class activation mapping. Fracture region of AFFs (atypical femoral fractures) (A+B) and NFFs (normal femur fractures) (C+D) are correctly depicted by tage of residual blocks to solve the problem of the network. Focus outside the fracture region (E+F) might lead to misclassification. a vanishing gradient with increasing depth of the neural networks. Moreover, the convergence showing the highest diagnostic accuracy for the ResNet50 speed of ResNet is much higher than that of either Inception or network, mean AUC value 0.9 (SD 0.01). VGG19. In line with our hypothesis, the intervention pathway improved the performances of all the networks. In clinical rouComparison with multi-metrics tine, these interventions could be applied by radiographers or When comparing multiple diagnostic metrics, the ResNet50 other personnel. The smallest increase in benefit was obtained network outperformed the competitor networks in terms of for ResNet, which already showed high accuracy and specific-

Acta Orthopaedica 2021; 92 (4): 394–400

399

Figure 6. Artificial intelligence designed to attract attention to places where attention is needed. Illustrated by Pontus Andersson.

ity and precision > 90% in the automated pathway. However, the minimal intervention effort of making 2 computer mouse clicks resulted in an improvement of on average 4%, which in medical diagnostics represents a relevant improvement. Finally, we found class activation mapping to be an essential qualitative tool for allowing human interpretation of the quantitative results given by the network (Figure 5). Studies on AI applications in medical diagnostics typically aim to challenge the power of the human mind in performing highly specialized tasks or tasks of high volume, where processing times matter. In the present study, we used the network to dichotomize, attempting to increase the radiologist’s sensitivity to a specific and rare fracture pattern among a large volume of normal fractures. In our understanding the purpose of the AI is not to replace the clinician but should provide a technical supplement to increase the likelihood of a correct diagnosis. In similarity to approaches taken previously, we used transfer learning from non-medical images to resolve the issue of a limited number of training images and to improve the performance of our CNN (Kim and MacKinnon 2018). Our results are similar to those of a previous study in which a CNN was used to distinguish 695 cases of wrist fracture from 694 non-fracture cases (Kim and MacKinnon 2018), achieving 95% accuracy. Even when CNNs were applied to classify proximal humeral fractures on plain anteroposterior shoulder radiographs, automated distinction of fractured from non-fractured shoulders was achieved with an accuracy of 96%. However, in the same sample, the CNN showed poorer performance in the classification of different types of fractures, with 65%–86% accuracy (Chung et al. 2018). This leads us to conclude that classifying fracture and non-fracture images is an easier task than distinguishing between 2 types of fractures. This type of task is also challenging for the human mind. Previous classification of shoulder fractures into different types according to a well-established classification system (Neer 1970) yielded only approximately 50% inter-observer reliability and 60% intra-observer reliability (Sidor et al. 1993). The human mind tends to suffer from fatigue when engaging in cognitively demanding, repetitive tasks (Tomei et al. 2006, Ren 2018). Therefore, allowing the CNN to perform the bulk analysis and to bring suspected cases to the attention of the radiologist through the visualization

of discriminative image regions means that the CNN emerges as an attractive tool in this study (Figure 6). This study is the first attempt to use artificial intelligence for the radiographic diagnosis of atypical femoral fractures. We used radiographs from a selected cohort of patients who had suffered a femoral shaft fracture but had no previous fractures, pathologic features, or pre-existing implants. This may explain the excellent results and could limit the applicability in a clinical setting, in which the CNN might be blurred by implants and pathologic features in the bone surrounding the fracture site. In conclusion, we demonstrate that CNNs are a promising tool for the radiographic detection of rare atypical femoral fractures. Given that < 10% of patients are correctly identified on the basis of diagnostic radiographs at the moment, CNNs could contribute potentially as a technical supplement for the clinician, although this is currently limited to the experimental setting. Further training with CNNs and their exposure to a real-world clinical environment are warranted. Supplementary data Intervention script is available as supplementary data in the online version of this article, http://dx.doi.org/10.1080/17453 674.2021.1891512

GZ and YC contributed equally. YC: Study design, data analysis, writing of the manuscript. GZ: Study design, review of radiographs, writing and revision of the manuscript. HPB: Review of radiographs, writing and revision of the manuscript. CW: Study design, data analysis, writing and revision of the manuscript. MW: Study design, writing and revision of the manuscript. JS: Study initiation, study design, review of radiographs, writing and revision of the manuscript. The authors thank Analytic Imaging Diagnostics Arena (AIDA) for funding; Vinnova grant 2017-02447, Region Östergötland ALF grants and the Knut and Alice Wallenberg Foundation for generous support. Our late colleague Per Aspenberg is acknowledged for his valuable support and intellectual contributions. Acta thanks Fabian van de Bunt and Max Gordon for help with peer review of this study.

400

Adams M, Chen W, Holcdorf D, McCusker M W, Howe P D, Gaillard F. Computer vs human: deep learning versus perceptual training for the detection of neck of femur fractures. J Med Imaging Radiat Oncol 2019; 63(1): 27-32. doi: 10.1111/1754-9485.12828. Bogl H P, Michaelsson K, Zdolsek G, Hoijer J, Schilcher J. Increased rate of reoperation in atypical femoral fractures is related to patient characteristics and not fracture type: a nationwide cohort study. Osteoporos Int 2020a; 31(5): 951-9. doi: 10.1007/s00198-019-05249-3. Bogl H P, Zdolsek G, Michaelsson K, Hoijer J, Schilcher J. reduced risk of reoperation using intramedullary nailing with femoral neck protection in low-energy femoral shaft fractures. J Bone Joint Surg Am 2020b; 102(17): 1486-94. doi: 10.2106/JBJS.20.00160. Brett A, Miller C G, Hayes C W, Krasnow J, Ozanian T, Abrams K, Block J E, van Kuijk C. Development of a clinical workflow tool to enhance the detection of vertebral fractures: accuracy and precision evaluation. Spine (Phila Pa 1976) 2009; 34(22): 2437-43. doi: 10.1097/BRS.0b013e3181b2eb69. Brinker T J, Hekler A, Utikal J S, Grabe N, Schadendorf D, Klode J, Berking C, Steeb T, Enk A H, von Kalle C. Skin cancer classification using convolutional neural networks: systematic review. J Med Internet Res 2018; 20(10): e11936. doi: 10.2196/11936. Chung S W, Han S S, Lee J W, Oh K S, Kim N R, Yoon J P, Kim J Y, Moon S H, Kwon J, Lee H J, Noh Y M, Kim Y. Automated detection and classification of the proximal humerus fracture by using deep learning algorithm. Acta Orthop 2018; 89(4): 468-73. doi: 10.1080/17453674.2018.1453714. Deng J, Dong W, Socher R, Li L-J, Li K, Fei-Fei L. Imagenet: a large-scale hierarchical image database. In: 2009 IEEE conference on computer vision and pattern recognition. New York: IEEE; 2009. p 248-55. Goodfellow I, Bengio Y, Courville A. Deep learning. Cambridge, MA: MIT Press; 2016. Harborne K, Hazlehurst J M, Shanmugaratnam H, Pearson S, Doyle A, Gittoes N J, Choudhary S, Crowley R K. Compliance with established guidelines for the radiological reporting of atypical femoral fractures. Br J Radiol 2016; 89(1057): 20150443. doi: 10.1259/bjr.20150443. He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition 2016. p. 770-8. Kalmet P H S, Sanduleanu S, Primakov S, Wu G, Jochems A, Refaee T, Ibrahim A, Hulst L V, Lambin P, Poeze M. Deep learning in fracture detection: a narrative review. Acta Orthop 2020; 91(2): 215-20. doi: 10.1080/17453674.2019.1711323. Kim D H, MacKinnon T. Artificial intelligence in fracture detection: transfer learning from deep convolutional neural networks. Clin Radiol 2018; 73(5): 439-45. doi: 10.1016/j.crad.2017.11.015. Kingma D P, Ba J. Adam: a method for stochastic optimization. arXiv preprint arXiv:14126980 2014. Lindsey R, Daluiski A, Chopra S, Lachapelle A, Mozer M, Sicular S, Hanel D, Gardner M, Gupta A, Hotchkiss R, Potter H. Deep neural network improves fracture detection by clinicians. Proc Natl Acad Sci USA 2018; 115(45): 11591-6. doi: 10.1073/pnas.1806905115. Neer C S. 2nd. Displaced proximal humeral fractures, I: Classification and evaluation. J Bone Joint Surg Am 1970; 52(6): 1077-89. Odvina C V, Zerwekh J E, Rao D S, Maalouf N, Gottschalk F A, Pak C Y. Severely suppressed bone turnover: a potential complication of alendronate therapy. J Clin Endocrinol Metab 2005; 90(3): 1294-301. doi: 10.1210/ jc.2004-0952. Odvina C V, Levy S, Rao S, Zerwekh J E, Rao D S. Unusual mid-shaft fractures during long-term bisphosphonate therapy. Clin Endocrinol (Oxf) 2010; 72(2): 161-8. doi: 10.1111/j.1365-2265.2009.03581.x. Olczak J, Fahlberg N, Maki A, Razavian A S, Jilert A, Stark A, Skoldenberg O, Gordon M. Artificial intelligence for analyzing orthopedic trauma radiographs. Acta Orthop 2017; 88(6): 581-6. doi: 10.1080/17453674.2017.1344459. Ren P. Cognitive fatigue in repetitive learning task. Techniques Neurosurg Neurol 2018; 2(1). doi: 10.31031/tnn.2018.02.000527.

Acta Orthopaedica 2021; 92 (4): 394–400

Ronneberger O, Fischer P, Brox T. U-net: convolutional networks for biomedical image segmentation. In: International Conference on Medical Image Computing and Computer-assisted Intervention. Berlin: Springer; 2015. p 234-41. Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M, Berg A C, Fei-Fei L. ImageNet large scale visual recognition challenge. Int J Comput Vis 2015; 115(3): 211-52. doi: 10.1007/s11263-015-0816-y. Schilcher J, Michaelsson K, Aspenberg P. Bisphosphonate use and atypical fractures of the femoral shaft. N Engl J Med 2011; 364(18): 1728-37. doi: 10.1056/NEJMoa1010650. Schilcher J, Koeppen V, Ranstam J, Skripitz R, Michaelsson K, Aspenberg P. Atypical femoral fractures are a separate entity, characterized by highly specific radiographic features: a comparison of 59 cases and 218 controls. Bone 2013; 52(1): 389-92. doi: 10.1016/j.bone.2012.10.016. Schilcher J, Koeppen V, Aspenberg P, Michaelsson K. Risk of atypical femoral fracture during and after bisphosphonate use. Acta Orthop 2015; 86(1): 100-7. doi: 10.3109/17453674.2015.1004149. Shane E. Evolving data about subtrochanteric fractures and bisphosphonates. N Engl J Med 2010; 362(19): 1825-7. doi: 10.1056/NEJMe1003064. Shane E, Burr D, Ebeling P R, Abrahamsen B, Adler R A, Brown T D, Cheung A M, Cosman F, Curtis J R, Dell R, Dempster D, Einhorn T A, Genant H K, Geusens P, Klaushofer K, Koval K, Lane J M, McKiernan F, McKinney R, Ng A, Nieves J, O’Keefe R, Papapoulos S, Sen H T, van der Meulen M C, Weinstein RS , Whyte M, American Society for B, Mineral R. Atypical subtrochanteric and diaphyseal femoral fractures: report of a task force of the American Society for Bone and Mineral Research. J Bone Miner Res 2010; 25(11): 2267-94. doi: 10.1002/jbmr.253. Shane E, Burr D, Abrahamsen B, Adler R A, Brown T D, Cheung A M, Cosman F, Curtis J R, Dell R, Dempster D W, Ebeling P R, Einhorn T A, Genant H K, Geusens P, Klaushofer K, Lane J M, McKiernan F, McKinney R, Ng A, Nieves J, O’Keefe R, Papapoulos S, Howe T S, van der Meulen M C, Weinstein R S, Whyte M P. Atypical subtrochanteric and diaphyseal femoral fractures: second report of a task force of the American Society for Bone and Mineral Research. J Bone Miner Res 2014; 29(1): 1-23. doi: 10.1002/jbmr.1998. Shin H C, Roth H R, Gao M, Lu L, Xu Z, Nogues I, Yao J, Mollura D, Summers R M. Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE Trans Med Imaging 2016a; 35(5): 1285-98. doi: 10.1109/ TMI.2016.2528162. Shin J S, Kim N C, Moon K H. Clinical features of atypical femur fracture. Osteoporos Sarcopenia 2016b; 2(4): 244-9. doi: 10.1016/j. afos.2016.08.001. Sidor M L, Zuckerman J D, Lyon T, Koval K, Cuomo F, Schoenberg N. The Neer classification system for proximal humeral fractures: an assessment of interobserver reliability and intraobserver reproducibility. J Bone Joint Surg Am 1993; 75(12): 1745-50. doi: 10.2106/00004623-199312000-00002. Simonyan K, Zisserman A. Very deep convolutional networks for largescale image recognition. arXiv preprint arXiv:14091556 2014. Starr J, Tay Y K D, Shane E. Current understanding of epidemiology, pathophysiology, and management of atypical femur fractures. Curr Osteoporos Rep 2018; 16(4): 519-29. doi: 10.1007/s11914-018-0464-6. Szegedy C, Vanhoucke V, Iouffe S, Shlens J, Wojna Z. Rethinking the Inception Architecture for Computer Vision. CVPR 2016 2016. doi: https:// doi.org/10.1109/CVPR.2016.308. Tomei G, Cinti M E, Cerratti D, Fioravanti M. [Attention, repetitive works, fatigue and stress]. Ann Ig 2006; 18(5): 417-29. Urakawa T, Tanaka Y, Goto S, Matsuzawa H, Watanabe K, Endo N. Detecting intertrochanteric hip fractures with orthopedist-level accuracy using a deep convolutional neural network. Skeletal Radiol 2019; 48(2): 239-44. doi: 10.1007/s00256-018-3016-3. Zhou B, Khosla A, Lapedriza A, Oliva A, Torralba A. Learning deep features for discriminative localization. In: Proceedings of the IEEE conference on computer vision and pattern recognition; 2016. p 2921-9.

Acta Orthopaedica 2021; 92 (4): 401–407

401

Thromboprophylaxis in primary shoulder arthroplasty does not seem to prevent death: a report from the Norwegian Arthroplasty Register 2005–2018 Randi M HOLE 1,2, Anne Marie FENSTAD 1, Jan-Erik GJERTSEN 1,2, Stein A LIE 1,3, and Ove FURNES 1,2 1 Norwegian

Arthroplasty Register, Department of Orthopedic Surgery, Haukeland University Hospital, Bergen; 2 Department of Clinical Medicine, University of Bergen; 3 Department of Clinical Dentistry, University of Bergen, Norway Correspondence: Randi.margrete.hole@helse-bergen.no Submitted 2020-12-08. Accepted 2021-03-01.

Background and purpose — There is still no consensus on whether to use thromboprophylaxis as a standard treatment in shoulder replacement surgery. We investigated the use of thromboprophylaxis reported to the Norwegian Arthroplasty Register (NAR). The primary endpoint was early mortality after primary shoulder arthroplasty with and without thromboprophylaxis. Secondary endpoints included revisions within 1 year and intraoperative complications. Patients and methods — This observational study included 6,123 primary shoulder arthroplasties in 5,624 patients reported to the NAR from 2005 to 2018. Cox regression analyses including robust variance analysis were performed with adjustments for age, sex, ASA score, diagnosis, type of implant, fixation, duration of surgery, and year of primary surgery. An instrumental variable Cox regression was performed to estimate the causal effect of thromboprophylaxis. Results — Thromboprophylaxis was used in 4,089 out of 6,123 shoulder arthroplasties. 90-day mortality was similar between the thromboprophylaxis and no thromboprophylaxis groups (hazard ratio (HR) = 1.1, 95% CI 0.6–2.4). High age (> 75), high ASA class (≥ 3), and fracture diagnosis increased postoperative mortality. No statistically significant difference in the risk of revision within 1 year could be found (HR = 0.6, CI 0.3–1.2). The proportion of intraoperative bleeding was similar in the 2 groups (0.2%, 0.3%). Interpretation — We had no information on cause of death and relation to thromboembolic events. However, no association of reduced mortality with use of thromboprophylaxis was found. Based on our findings routine use of thromboprophylaxis in shoulder arthroplasty can be questioned.

Shoulder arthroplasty (SA) has gained wide acceptance as treatment for a variety of shoulder conditions, and the annual incidence rates are increasing (Lubbeke et al. 2017). Venous thromboembolism (VTE) is a recognized complication after hip and knee arthroplasties (Lie et al. 2002) but has been considered rare after SA. The number of reports of VTE after SA has increased with increasing number of SAs performed (Lyman et al. 2006, Jameson et al. 2011) and fatal outcome has also been reported (Saleem and Markel 2001, Madhusudhan et al. 2009). The true risk of VTE after SA has not been determined, and even though some studies suggest that the risk equals that of lower limb arthroplasty (Willis et al. 2009), most studies find a lower risk in the upper extremities (Isma et al. 2010, Saleh et al. 2013). Chemical thromboprophylaxis reduces the rates of symptomatic VTE following lower limb arthroplasty and is supposed to reduce mortality from thromboembolic complications (Dahl 1998, Senay et al. 2018). Thromboprophylaxis remains controversial among surgeons because it may carry a higher risk of bleeding, wound complication, and reoperation after orthopedic surgery (Kwong et al. 2012). Guidelines on thromboprophylaxis exist in Norway and in other countries (SIGN 2010, Falck-Ytter et al. 2012, Kristiansen et al. 2014, National Institute for Health and Clinical Excellence 2018, Samama et al. 2018). While thromboprophylaxis is recommended for all patients undergoing hip or knee arthroplasties, there are still no evidence-based guidelines specific for SA. Due to the low number of SAs performed and the low rate of deaths due to thromboembolic events, a randomized trial would not be feasible. Hence, the best option to study the effect of thromboprophylaxis is large cohort studies (Fender et al. 1997). Using an observational population-based design with data from the Norwegian

402

Acta Orthopaedica 2021; 92 (4): 401–407

Arthroplasty Register (NAR) we studied the use of thromboprophylaxis in patients undergoing SA. Our primary endpoint was the influence of thromboprophylaxis on 90-day mortality. Secondary endpoints were intraoperative bleeding complications and revision due to all causes and due to infection within 1 year.

intraoperative bleeding complications and revisions during the first year after surgery were also included in the analyses. All 6,972 primary shoulder arthroplasties reported to NAR in the period 2005–2018 were eligible for inclusion in the study. No patients emigrated during the study period. We excluded 849 operations with missing information in one or more of the variables of interest. Finally 6,123 cases were included in the study.

Patients and methods

Statistics Pearson’s chi-square test was used for comparison of categorical variables. Survival time for the 2 subgroups of patients was calculated using Kaplan–Meier estimates. Endpoint was death of any cause within 90 days. Cox regression analyses were used to calculate hazard ratios (HRs) for postoperative deaths and risk of revision between patients receiving thromboprophylaxis and those not receiving prophylaxis, with adjustments for possible confounding of age, sex, ASA score, diagnosis, type of implant (anatomic total, reversed, or hemiarthroplasty), fixation (cemented or uncemented humerus stem), duration of surgery, and year of surgery. Bilateral cases were treated in the descriptive part as if they were independent, while the adjusted HRs were calculated using robust variance estimates to account for bilateral SAs. Calculation of the robust variance estimates follows the counting process formula of Andersen and Gill (Andersen and Gill 1982, Therneau and Grambsch 2000). As an alternative to the adjusted Cox regression, we estimated the causal effect of thromboprophylaxis using an instrumental variable (IV) approach. This analysis follows the methods described by MacKenzie et al. (2014) for IVs in a Cox regression model using the statistical package R (R Foundation for Statistical Computing, Vienna, Austria). As instrument, we applied the hospital’s annual propensity for using thrombosis prophylaxis. Hence, the IV approach assumes that the hospital is related to the mortality only through the use of thrombosis prophylaxis, and that the hospital is independent of unobserved covariates. Under these conditions the estimated HR can be interpreted as a causal HR of thrombosis prophylaxis on mortality. All tests were 2-sided and p-values below 0.05 were considered statistically significant. Follow-up started on the day of the primary arthroplasty and ended on the date of death or at 90 days for the mortality analyses and at 1 year after surgery for the revision analyses. All analyses were repeated stratifying on age, sex, ASA classification, diagnosis, and arthroplasty type in order to study the potential differences in effect of thromboprophylaxis on outcomes in subgroups of patients. Analyses were performed using the package IBM SPSS statistics version 24.0 (IBM Corp, Armonk, NY, USA) and the statistical package R Version 4.0.0 (R Foundation, Vienna, Austria).

This study was performed according to the Reporting of studies Conducted using the Observational Routinely collected health Data (RECORD) checklist. The NAR started collecting data on shoulder arthroplasties in 1994. All hospitals in Norway performing SAs report to the register (Fevang et al. 2009). After each operation the surgeon fills in a 1-page paper form, which includes details on the surgical procedure and implants with catalogue numbers. In addition, the form includes information on age, sex, indication for operation, duration of surgery, and intraoperative complications including major bleeding. From 2005 information also includes details on chemical thromboprophylaxis and comorbidity according to the ASA classification. The completeness of reporting of primary SAs in the NAR was 95% for primary operations compared with the Norwegian Patient Registry in 2017–2018 (Furnes et al. 2020). All patients operated on with SA in the period studied were included regardless of the cause for operation. Rheumatoid arthritis, psoriatic arthritis, ankylosing spondylitis, seronegative arthritis, and systemic lupus erythematosus were grouped together and categorized as inflammatory arthritis. Several diagnoses could be given for each operation, and in cases with more than 1 diagnosis we used the hierarchy developed by the Nordic Arthroplasty Register Association (NARA) (Rasmussen et al. 2016). The NAR uses the unique personal ID given to each inhabitant of Norway to link the primary shoulder arthroplasty to subsequent revisions and reoperations. Revisions and reoperations are reported equivalent to the primary operation. A revision is defined as the insertion, exchange, or extraction of any of the prosthesis components while a procedure without insertion, exchange, or extraction of components is registered as a reoperation. Multiple reasons for revision can be marked on the form. In cases with more than 1 reason for revision the hierarchy developed by the NARA group was used to determine 1 main reason for revision. Reoperations without the exchange or extraction of components were reported to the register from 2011. In our dataset there were no reported reoperations. The NAR was linked to the National Population Register and information on death and emigration was available for all patients. Deaths in the first 90 days after surgery were defined as primary outcome, as deaths after this period were considered less likely to be related to the index procedure. Reported

Acta Orthopaedica 2021; 92 (4): 401–407

Table 1. Patient and procedure characteristics at primary shoulder arthroplasties relative to thromboprophylaxis or no thromboprophylaxis reported to the Norwegian Arthroplasty Register 2005–2018

403

Annual number of shoulder arthroplasty 900 800

Thromboprophylaxis Factor No Yes p-value

Thromboprophylaxis No thromboprophylaxis

700 600

Number of procedures 2,034 (33) 4,089 (67) Women 1,429 (70) 2,846 (70) 0.6 a Mean age at surgery (SD) 70.7 (10.8) 70.9 (10.7) 0.6 b Age group 0.07 a ≤ 64 years 568 (28) 1,035 (25) 65–74 years 697 (34) 1,484 (36) ≥ 75 years 769 (38) 1,570(38) ASA class 0.02 a 1–2 1,329 (65) 2,546 (62) 3–4 705 (35) 1,543 (38) Arthroplasty type < 0.001 a TSA 355 (18) 887 (22) RSA 816 (40) 1,957 (48) SHA 623 (31) 965 (24) Other 240 (12) 280 (7) Diagnosis 0.01 a Primary arthritis 740 (36) 1,584 (39) Acute fracture 583 (29) 1,068 (26) Fracture sequelae 294 (15) 578 (14) Rotator cuff arthropathy 193 (9.5) 323 (7.9) Inflammatory arthritis 156 (7.7) 357 (8.7) Other 68 (3.3) 179 (4.4) Duration of surgery in minutes, mean (SD) 109 (42) 114 (37) < 0.001 b Fixation of stem < 0.001 a Cemented 895 (44) 2,396 (59) Uncemented 1,139 (56) 1,693 (41) TSA = total shoulder arthroplasty, RSA = reverse shoulder arthroplasty, SHA = stemmed hemiarthroplasty. a Pearson’s chi-square test; b Student’s t-test.

Ethics, funding, and potential conflicts of interests The NAR has permission from the Norwegian Data Inspectorate to collect patient data based on written consent from the patients (ref 24.1.2017: 16/01622-3/CDG). The Norwegian Arthroplasty Register is financed by the Western Norway health authorities. The authors declare no conflict of interest.

Results 4,089 cases received thromboprophylaxis and 2,034 did not receive thromboprophylaxis. Low molecular weight heparin (LMWH) was the dominant medication used. 2,778 patients were treated with dalteparin and 1,201 patients with enoxaparin (68% and 29% of the patients receiving thromboprophylaxis respectively). Patient and procedure characteristics for the 2 groups are shown in Table 1. The patients receiving thromboprophylaxis had statistically significantly higher mean ASA class and longer mean duration of surgery. Patients operated on with a reverse shoulder arthroplasty (RSA) more frequently received thromboprophylaxis compared with patients operated on with

500 400 300 200 100 0

2006

2008

2010

2012

2014

2016

2018

Year

Figure 1. Change in the use of thromboprophylaxis over time, Norwegian Arthroplasty Register 2005–2018. Annual number of shoulder arthroplasty 900 800 700

Stemmed hemiarthroplasty Total shoulder arthroplasty Reverse shoulder arthroplasty Other shoulder arthroplasty

600 500 400 300 200 100 0

2006

2008

2010

2012

2014

2016

2018

Year

Figure 2. Change in the use of different arthroplasty design over time, Norwegian Arthroplasty Register 2005–2018.

stemmed hemiarthroplasty (SHA) and total shoulder arthroplasty (TSA) (p < 0.001). There was an increase in the use of thromboprophylaxis over time in the period studied (Figure 1). The use of hemiarthroplasty dominated in the earlier years of this period, and the use of RSAs and TSAs increased in the later years (Figure 2). Risk of death We identified 50 deaths within 90 days in the period studied, 35 in the thromboprophylaxis group and 15 in the group with no thromboprophylaxis (Figure 3). Adjusted HR showed no significant difference between the two groups (HR 1.2; CI 0.6–2.2) with the no thromboprophylaxis group as reference. Using the IV approach, we found a non-significant causal effect of thromboprophylaxis on 90-day mortality (HR 1.1; CI 0.6–2.4) (Table 2).

404

Acta Orthopaedica 2021; 92 (4): 401–407

Death within 90 days (%)

Revision within 1 year (%)

Infection within 1 year (%)

2.0

Thromboprophylaxis No thromboprophylaxis

4 1.5

1.5

3 1.0

1.0 2

0.5

0.5 1

0.0

0 0

Days from surgery

Figure 3. Kaplan–Meier curve showing the death rate up to 90 days after surgery in patients with and without thromboprophylaxis with 95% CI.

Months from surgery

Figure 4. Kaplan–Meier curve showing the revision rate due to all causes (A) and due to infection (B) up to 1 year with 95% CI.

Table 2. Kaplan–Meier (K–M) estimated risk of death at 90 days: shoulder arthroplasties reported to the Norwegian Arthroplasty Register 2005–2018 No thromboprophylaxis Thromboprophylaxis

Deaths at 90 days 15 35

At risk K–M % deaths Adjusted at 90 days (95% CI) HR (95% CI) 1,928 3,859

0.8 (0.4–1.2) 0.9 (0.7–1.1)

1 1.2 (0.6–2.1)

IV adjusted HR (95% CI) 1 1.1 (0.6–2.4)

Cox adjusted hazard ratio (HR) with robust variance estimates adjusted for age, sex, ASA class, diagnosis, arthroplasty type, use of cement in humerus, duration of surgery and time period. IV = Instrument variable approach.

Compared with patients with primary osteoarthritis, patients with acute fractures had a higher 90-day mortality (HR 3.4; CI 1.2–9.5). A similar tendency was found for patients with sequelae after fracture, but the difference was not statistically significant. Patients with rotator cuff arthropathy or inflammatory arthritis did not have increased 90-day mortality compared with patients with primary osteoarthritis (Table 3, see Supplementary data). We found higher 90-day mortality after fracture-related surgery (acute fracture and fracture sequelae) than after non-fracture-related surgery 1.6% (CI 1.0–2.2) vs. 0.3% (CI 0.1–0.5). Old age (> 75 years), high ASA class (≥ 3), and acute fracture diagnosis statistically significantly increased 90-day mortality. The risk of death was not significantly changed in the different time periods studied (Table 3, see Supplementary data). ASA classification and age Since both increasing ASA class and high age increased mortality, we also performed Cox regression analysis with patients stratified into 3 different risk groups, dependent on both age (≥ 80 based on the Norwegian guidelines for thromboprophy-

laxis) and ASA classification. This analysis suggested an even stronger correlation between age, ASA class, and the risk of death. We found no statistically significant difference in the distribution of thromboprophylaxis in the different risk groups and use of thromboprophylaxis did not alter the risk of death at 90 days (Table 4, see Supplementary data). Revision risk There were 155 revisions within the first year. Of these, 29 revisions were performed due to deep infections (16 in the thromboprophylaxis group and 13 in the no thromboprophylaxis group). 62 revisions were due to loosening of 1 or more of the components without deep infection recorded. Risks of revision of any cause (HR 0.8; CI 0.6–1.1) and for infection (HR 0.6; CI 0.3–1.2) were similar between the study groups (Table 5, see Supplementary data, Figure 4). No reoperations were recorded. Intraoperative complications 182 intraoperative complications were registered. Extensive intraoperative bleeding was reported in 17 cases, 12 in the thromboprophylaxis group (0.3%) and 5 in the no throm-

Acta Orthopaedica 2021; 92 (4): 401–407

boprophylaxis group (0.2%). Only 3 of the 12 patients with extensive bleeding in the thromboprophylaxis group had preoperative initiation of the thromboprophylaxis.

Discussion Our main finding was that there was no association between the use of thromboprophylaxis and the risk of death in the postoperative period. As expected, we found that high age, high ASA class, and fracture diagnosis (acute fracture and fracture sequelae) increased the 90-day mortality. Earlier studies on thromboprophylaxis in shoulder arthroplasty surgery include fewer patients, and even though the number of deaths in our study is low the incidence is comparable to earlier studies. Thromboprophylaxis after shoulder surgery is still a controversial issue: the national guidelines in Norway and other countries are vague. The guidelines in the United Kingdom (National Institute for Health and Clinical Excellence 2018) recommends that the surgeon “Consider VTE prophylaxis for people undergoing upper limb surgery if the person’s total time under general anaesthetic is over 90 minutes or where their operation is likely to make it difficult for them to mobilise.” Based on these recommendations the vast majority of shoulder arthroplasty patients will require thromboprophylaxis. However, VTE events are rare after planned shoulder surgery (0.01–0.5%) (Lyman et al. 2006, Jameson et al. 2011, Navarro et al. 2013). In the study from Jameson et al. (2011) the 90-day mortality rates after planned shoulder surgery were low (0.03–0.5%), and no change in the mortality rate after the introduction of the 2007 NICE guidelines could be found. Our results with 0.3% 90-day mortality in non-fracture SA surgery support Jameson’s findings. VTE events are more common in the proximal humerus fracture setting (0.4–1.7%) (Navarro et al. 2013) but compared with other orthopedic procedures the risk is still low (Dahl et al. 2003). In a large cohort study from Young et al. (2015) proximal humerus fracture, anemia, congestive heart failure, and chronic lung disease were 4 independent predictors for PE after shoulder arthroplasty. As expected, increasing age, fracture diagnosis, and high ASA class correlated with increased mortality in our cohort. We found increased use of thromboprophylaxis in shoulder arthroplasty surgeries during the studied period. The Norwegian guidelines (Kristiansen et al. 2014) indicating that thromboprophylaxis should be used have probably led to some hospitals changing their use of prophylaxis, but some hospitals were not consistent in their use of prophylaxis. This may be explained by surgeon’s preference or by lack of routines. It could also reflect diversified treatment where patients considered at risk are given thromboprophylaxis. The use does not seem to correlate with the patient’s ASA class, but the ASA

405

class does not fully account for risk factors like previous DVT or other predisposing factors and may therefore not necessarily be a good measure of the actual risk of VTE and mortality. In our study the use of different arthroplasty types changed during the period studied and some of the differences in the use of thromboprophylaxis in different arthroplasties can be explained by the change of indications for the arthroplasty type. The lack of consensus on the use of prophylaxis in shoulder replacement surgery is reflected by our data, where some hospitals seem to give thromboprophylaxis as a routine and others do not. Some hospitals perform more elective surgery and have more rheumatoid patients while other perform more fracture surgery, and this may also influence the hospital’s routines for thromboprophylaxis. The cost-effectiveness of daily injections of LMWH has to be considered. It is inconvenient for the patient and resource demanding for the healthcare system if patients cannot administer the injections themselves, and there are potential complications. However, we found no difference in intraoperative bleeding complications between the 2 groups and the use of thromboprophylaxis did not seem to affect the risk of revision due to infection. Kwong et al. (2012) found insufficient data in the literature to confirm or refute the hypothesis that postoperative bleeding due to VTE prophylaxis in hip and knee arthroplasty contributes to increased risk for wound infection. Navarro et al. (2013) observed no difference in 90-day mortality by procedure type (reverse shoulder arthroplasties, total shoulder arthroplasties, or hemiarthroplasties), but a higher mortality in trauma patients compared with elective in his retrospective database review from the Kaiser Permanente registry. In our cohort we found increased risk of mortality in the acute fracture setting, and use of thromboprophylaxis did not alter this risk. Navarro found that only 1 of the 13 deaths observed in his study could be attributed to complications of PE, and this indicates that this is a fragile group of patients with several comorbidities and increased risk of death. In accordance with this we found increased risk of death in the acute fracture group and also higher age in this group. By dividing patients into risk groups and combining the ASA classification with age, Dale et al. (2020) showed that high-risk patients had nearly 9 times the risk of adjusted perioperative death after primary total hip arthroplasty compared with low-risk patients. In our study the use of thromboprophylaxis did not alter the risk of death within 90 days in any of the risk groups. This does not support the routine use of thromboprophylaxis to prevent death. The bilateral observations in register studies can be dealt with in different ways (Ranstam et al. 2011). Also, Lie et al. (2004) studied the influence of bilateral hip arthroplasties on survival analyses and concluded that in analyses of arthroplasty survival dependencies should be considered, but ignoring the possible dependencies does not necessarily have an impact on the result. We performed Cox regression analyses

406

with robust variance analyses to account for the bilateral cases and found only small differences (statistically non-significant) between unadjusted and adjusted risk of death. Using an instrument variable analysis approach to estimate the causal effect of thrombosis prophylaxis confirmed the results from the standard analysis. Strengths and limitations This is a nationwide observational cohort study from the Norwegian Arthroplasty Register. The strengths of a register study are the large number of patients and the possibility to study rare events. All hospitals performing shoulder arthroplasties in Norway are reporting to the register and the completeness of reporting primary cases is 95% (Furnes et al. 2020). Information on death and migration was available from Statistics Norway, allowing for nationwide cohort studies with complete follow-up. We do not, however, have access to the cause of death or readmissions to hospital due to VTE or bleeding in these patients. Lie et al. (2002) studied 67,000 hip arthroplasties and early postoperative mortality by linkage to the cause of death registry. They found that vascular causes of death were commonest, with the subcategory thromboembolic complications as the most frequent cause. Even though we do not have access to cause of death in our material we might assume that thromboembolic complications are also a common cause of death in shoulder arthroplasty surgery. This is confirmed in a study from the Danish Shoulder Arthroplasty Registry (Amundsen et al. 2016). They reported the 90-day mortality and the reasons for death between 2006 and 2012. In their study, approximately 30% of deaths were reported with a cardiac or pulmonary cause. In light of the results from Amundsen’s study we can assume that the number of deaths related to thromboembolic events in our study, with only 35 and 15 deaths in the 2 groups, were low and probably insufficient to make any clear recommendations. The use of thromboprophylaxis as a standard method of treatment varies among hospitals. This might influence the result, as different surgeons may have different results. The instrumental variable analysis accounted for these differences by applying the hospitals’ propensity for using thromboprophylaxis in the model and the results from the standard analysis were confirmed. An intraoperative bleeding complication was recorded only if the surgeon considered it to be extensive, and the amount of bleeding was not recorded. The completeness of the registration of complications has not been investigated. The findings regarding intraoperative complications must hence be interpreted with caution, and the incidence of such complications is most likely higher than reported. Until 2011, reoperation due to bleeding or hematoma was not reported to the register unless a revision of the prosthesis was also performed. From 2011 all reoperations should be reported to the register, but the completeness of this registration is not known and may be underreported.

Acta Orthopaedica 2021; 92 (4): 401–407

Conclusion The use of thromboprophylaxis does not seem to reduce the overall low mortality and the use of thromboprophylaxis as a routine in shoulder arthroplasty surgery to prevent thromboembolic complications leading to death can be discussed. We cannot exclude that subgroups of patients with a high risk of VTE, such as earlier VTE events, may benefit from thromboprophylaxis. Supplementary data Tables 3–5 are available as supplementary data in the online version of this article, http://dx.doi.org/10.1080/17453674.2021. 1906595

RMH performed the analyses of the data and wrote the manuscript. AMF and SAL contributed with statistical advice. All authors contributed to the conception and design of the study, critical analyses of the data, interpretation of the findings, and critical revision of the manuscript through all stages of the study. The authors would like to thank the Norwegian surgeons for their contribution by thoroughly reporting to the NAR and the staff at the NAR for their meticulous punching and statistical advice. Acta thanks Lars Adolfsson and Jeppe Vejlgaard Rasmussen for help with peer review of this study.

Amundsen A, Rasmussen J V, Olsen B S, Brorson S. Mortality after shoulder arthroplasty: 30-day, 90-day, and 1-year mortality after shoulder replacement—5853 primary operations reported to the Danish Shoulder Arthroplasty Registry. J Shoulder Elbow Surg 2016; 25(5): 756-62. doi: 10.1016/j.jse.2015.09.020. Andersen P K, Gill R. Cox’s regression model for counting processes: a large sample study. Annals of Statistics 1982; 10(4): 1100-20. Dahl O E. Thromboprophylaxis in hip arthroplasty: new frontiers and future strategy. Acta Orthop Scand 1998; 69(4): 339-42. doi: 10.3109/17453679808999042. Dahl O E, Gudmundsen T E, Bjornara B T, Solheim D M. Risk of clinical pulmonary embolism after joint surgery in patients receiving lowmolecular-weight heparin prophylaxis in hospital: a 10-year prospective register of 3,954 patients. Acta Orthop Scand 2003; 74(3): 299-304. doi: 10.1080/00016470310014229. Dale H, Borsheim S, Kristensen T B, Fenstad A M, Gjertsen J E, Hallan G, Lie S A, Furnes O. Perioperative, short-, and long-term mortality related to fixation in primary total hip arthroplasty: a study on 79,557 patients in the Norwegian Arthroplasty Register. Acta Orthop 2020; 91(2): 152-8. doi: 10.1080/17453674.2019.1701312. Falck-Ytter Y, Francis C W, Johanson N A, Curley C, Dahl O E, Schulman S, Ortel T L, Pauker S G, Colwell C W Jr. Prevention of VTE in orthopedic surgery patients: antithrombotic therapy and prevention of thrombosis, 9th ed: American College of Chest Physicians Evidence-Based Clinical Practice Guidelines. Chest 2012; 141(2 Suppl.): e278S-e325S. doi: 10.1378/chest.11-2404. Fender D, Harper W M, Thompson J R, Gregg P J. Mortality and fatal pulmonary embolism after primary total hip replacement: results from a regional hip register. J Bone Joint Surg Br 1997; 79(6): 896-9. Fevang B T, Lie S A, Havelin L I, Skredderstuen A, Furnes O. Risk factors for revision after shoulder arthroplasty: 1,825 shoulder arthroplasties from the Norwegian Arthroplasty Register. Acta Orthop 2009; 80(1): 83-91. doi: 10.1080/17453670902805098.

Acta Orthopaedica 2021; 92 (4): 401–407

Furnes O, Hallan G, Visnes H, Gundersen T, Kvinnesland I A, Fenstad A M, Dybvik E, Kroken G. Annual report: the Norwegian Advisory Unit on Arthroplasty and Hip Fractures; 2020. doi: 10.13140/RG.2.2.18876.46727 Isma N, Svensson P J, Gottsater A, Lindblad B. Upper extremity deep venous thrombosis in the population-based Malmo thrombophilia study (MATS): epidemiology, risk factors, recurrence risk, and mortality. Thromb Res 2010; 125(6): e335-8. doi: 10.1016/j.thromres.2010.03.005. Jameson S S, James P, Howcroft D W, Serrano-Pedraza I, Rangan A, Reed M R, Candal-Couto J. Venous thromboembolic events are rare after shoulder surgery: analysis of a national database. J Shoulder Elbow Surg 2011; 20(5): 764-70. doi: 10.1016/j.jse.2010.11.034. Kristiansen A, Brandt L, Berge E, Dahm A E, Halvorsen S, Sandset P M, Vandvik P O. [New guidelines for antithrombotic therapy and thromboprophylaxis]. Tidsskr Nor Laegeforen 2014; 134(9): 921-2. doi: 10.4045/tidsskr.13.1582. Kwong L M, Kistler K D, Mills R, Wildgoose P, Klaskala W. Thromboprophylaxis, bleeding and post-operative prosthetic joint infection in total hip and knee arthroplasty: a comprehensive literature review. Expert Opin Pharmacother 2012; 13(3): 333-44. doi: 10.1517/14656566.2012.652087. Lie S A, Engesaeter L B, Havelin L I, Furnes O, Vollset S E. Early postoperative mortality after 67,548 total hip replacements: causes of death and thromboprophylaxis in 68 hospitals in Norway from 1987 to 1999. Acta Orthop Scand 2002; 73(4): 392-9. doi: 10.1080/00016470216312. Lie S A, Engesaeter L B, Havelin L I, Gjessing H K, Vollset S E. Dependency issues in survival analyses of 55,782 primary hip replacements from 47,355 patients. Stat Med 2004; 23(20): 3227-40. doi: 10.1002/sim.1905. Lubbeke A, Rees J L, Barea C, Combescure C, Carr A J, Silman A J. International variation in shoulder arthroplasty. Acta Orthop 2017; 88(6): 592-9. doi: 10.1080/17453674.2017.1368884. Lyman S, Sherman S, Carter T I, Bach P B, Mandl L A, Marx R G. Prevalence and risk factors for symptomatic thromboembolic events after shoulder arthroplasty. Clin Orthop Relat Res 2006; 448:152-6. doi: 10.1097/01. blo.0000194679.87258.6e. MacKenzie T A, Tosteson T D, Morden N E, Stukel T A, O’Malley A J. Using instrumental variables to estimate a Cox’s proportional hazards regression subject to additive confounding. Health Serv Outcomes Res Methodol 2014; 14(1-2): 54-68. doi: 10.1007/s10742-014-0117-x. Madhusudhan T R, Shetty S K, Madhusudhan S, Sinha A. Fatal pulmonary embolism following shoulder arthroplasty: a case report. J Med Case Rep 2009; 3: 8708. doi: 10.4076/1752-1947-3-8708. National Institute for Health and Clinical Excellence. Venous thromboembolism in over 16s: reducing the risk of hospital-acquired deep vein

407

thrombosis or pulmonary embolism (NG89). Available at: www.nice.org. uk/guidance/ng89. [Guidelines] 2018. Navarro R A, Inacio M C, Burke M F, Costouros J G, Yian E H. Risk of thromboembolism in shoulder arthroplasty: effect of implant type and traumatic indication. Clin Orthop Relat Res 2013; 471(5): 1576-81. doi: 10.1007/s11999-013-2829-6. Ranstam J, Käarrholm J, Pulkkinen P, Mäkelä K, Espehaug B, Pedersen A B, Mehnert F, Furnes O, NARA study group. Statistical analysis of arthroplasty data, II: Guidelines. Acta Orthop 2011; 82(3): 258-67. doi: 10.3109/17453674.2011.588863. Rasmussen J V, Brorson S, Hallan G, Dale H, Aarimaa V, Mokka J, Jensen S L, Fenstad A M, Salomonsson B. Is it feasible to merge data from national shoulder registries? A new collaboration within the Nordic Arthroplasty Register Association. J Shoulder Elbow Surg 2016; 25(12): e369-e77. doi: 10.1016/j.jse.2016.02.034. Saleem A, Markel D C. Fatal pulmonary embolus after shoulder arthroplasty. J Arthroplasty 2001; 16(3): 400-3. doi: 10.1054/arth.2001.20546. Saleh H E, Pennings A L, ElMaraghy A W. Venous thromboembolism after shoulder arthroplasty: a systematic review. J Shoulder Elbow Surg 2013; 22 (10): 1440-8. doi: 10.1016/j.jse.2013.05.013 Samama C M, Afshari A, ESA VTE Guidelines Task Force. European guidelines on perioperative venous thromboembolism prophylaxis. Eur J Anaesthesiol 2018; 35(2): 73-6. doi: 10.1097/EJA.0000000000000702. Senay A, Trottier M, Delisle J, Banica A, Benoit B, Laflamme G Y, Malo M, Nguyen H, Ranger P, Fernandes J C. Incidence of symptomatic venous thromboembolism in 2372 knee and hip replacement patients after discharge: data from a thromboprophylaxis registry in Montreal, Canada. Vasc Health Risk Manag 2018; 14: 81-9. doi: 10.2147/VHRM.S150474. SIGN (Scottish Intercollegiate Guidelines Network). Prevention and management of venous thromboembolism. Edinburgh: SIGN; 2010. (SIGN publication no 122). [cited 10 Dec 2010]. http://www.sign.ac.uk. [Guidelines] 2010. Therneau T M, Grambsch P M. Modeling survival data: extending the Cox model. (ISBN 978-1-4757-3294-8). New York: Springer; 2000. Willis A A, Warren R F, Craig E V, Adler R S, Cordasco F A, Lyman S, Fealy S. Deep vein thrombosis after reconstructive shoulder arthroplasty: a prospective observational study. J Shoulder Elbow Surg 2009; 18(1): 100-6. doi: 10.1016/j.jse.2008.07.011 Young B L, Menendez M E, Baker D K, Ponce B A. Factors associated with in-hospital pulmonary embolism after shoulder arthroplasty. J Shoulder Elbow Surg 2015; 24(10): e271-8. doi: 10.1016/j.jse.2015.04.002.

408

Acta Orthopaedica 2021; 92 (4): 408–412

Prior hip arthroscopy does not affect 1-year patient-reported outcomes following total hip arthroplasty: a register-based matched case-control study of 675 patients Ida LINDMAN 1, Jonatan NÅTMAN 2, Axel ÖHLIN 1,3, Karin SVENSSON MALCHAU 1,3, Louise KARLSSON 1, Maziar MOHADDES 1,3, Ola ROLFSON 1,3, and Mikael SANSONE 1,3 ¹ Department of Orthopedics, Institute of Clinical Sciences, Sahlgrenska Academy, University of Gothenburg, Gothenburg; 2 Centre of Registers, Västra Götalandsregionen, Gothenburg; 3 Department of Orthopedics, Sahlgrenska University Hospital, Gothenburg, Sweden Correspondence: Lindman91@hotmail.com Submitted 2020-10-29. Accepted 2021-01-22.

Background and purpose — Femoroacetabular impingement syndrome (FAIS) is a common cause of hip pain and may contribute to the development of osteoarthritis. We investigated whether a prior hip arthroscopy affects the patient-reported outcomes (PROMs) of a later total hip arthroplasty (THA). Patients and methods — Patients undergoing hip arthroscopy between 2011 and 2018 were identified from a hip arthroscopy register and linked to the Swedish Hip Arthroplasty Register (SHAR). A propensity-score matched control group without a prior hip arthroscopy, based on demographic data and preoperative score from the EuroQoL visual analogue scale (EQ VAS) and hip pain score, was identified from SHAR. The group with a hip arthroscopy (treated group) consisted of 135 patients and the matched control group comprised 540 patients. The included PROMs were EQ-5D and EQ VAS of the EuroQoL group, and a questionnaire regarding hip pain and another addressing satisfaction. Rate of reoperation was collected from the SHAR. The follow-up period was 1 year. Results — The mean interval from arthroscopy to THA was 27 months (SD 19). The EQ-5D was 0.81 and 0.82, and EQ VAS was 78 and 79 in the treated group and the matched control group respectively. There were no differences in hip pain, and reported satisfaction was similar with 87% in the treated group and 86% in the matched control group. Interpretation — These results offer reassurance that a prior hip arthroscopy for FAIS does not appear to affect the short-term patient-reported outcomes of a future THA and indicate that patients undergoing an intervention are not at risk of inferior results due to their prior hip arthroscopy.

Femoroacetabular impingement syndrome (FAIS) implies abnormal morphology on the femoral or acetabular side of the hip joint and is a common cause of hip pain and dysfunction in the young population (Matar et al. 2019, Zhou et al. 2020). It reportedly increases the risk of developing osteoarthritis (OA), presumably due to damage to the chondrolabral structures (Ganz et al. 2003, Beck et al. 2005). Arthroscopic treatment of FAIS has been proven successful with 1- and 5-years’ follow-up (Griffin et al. 2018, Ohlin et al. 2020). However, one of the most common reoperations is conversion to a total hip arthroplasty (THA) (Harris et al. 2013). Depending on the follow-up period and severity of chondrolabral damages, 3–50% of patients with a previous hip arthroscopy for FAIS are reported to undergo THA later in life (Harris et al. 2013). Whether a prior hip arthroscopy affects the result of a subsequent THA (Haughom et al. 2016, Charles et al. 2017, Perets et al. 2017, Hoeltzermann et al. 2019, Vovos et al. 2019) has previously been discussed. However, many of these studies have been underpowered and the results have been incongruent. Most studies suggested no differences in outcomes in THA for patients with a prior hip arthroscopy (Haughom et al. 2016, Charles et al. 2017, Hoeltzermann et al. 2019). Yet inferior patient satisfaction and higher complication rates were reported in some studies (Perets et al. 2017, Vovos et al. 2019). To optimize the results for patients undergoing THA surgery, it is important to understand factors that could affect the outcomes. The possible effect of hip arthroscopy on future THA should also be considered during patient selection. We investigated the influence of a prior hip arthroscopy on a subsequent THA with patient-reported outcome measures (PROMs) 1 year after THA.

Acta Orthopaedica 2021; 92 (4): 408–412

Primary THA registered in SHAR 2011–2018 n = 104,379

THA with prior hip arthroscopy n = 166

409

Hip arthroscopy registered in local register 2011–2018 n = 2,516

THA with no prior hip arthroscopy n = 104,213

Excluded Missing data n = 31

Excluded (32,322): – not primary OA, 14,810 – missing data, 17,512 Total control group n = 71,891

Total treated group n = 135

1:4 propensity-scorematched group n = 540

Flow chart of included patients. Excluded diagnoses: tumors, fractures, or trauma. Excluded missing data due to missing preoperatively patient-reported outcomes or demographic data. Abbreviations: SHAR: Swedish Hip Arthroplasty Register, THA: total hip arthroplasty.

Patients and methods Patients Data were retrieved from a local hip arthroscopy register covering procedures due to FAIS undertaken at 2 hospitals (Sahlgrenska University Hospital, Mölndal and Orthocenter Gothenburg, Sweden) between 2011 and 2018 (Sansone et al. 2014). Based on the unique personal identity number given to all permanent residents in Sweden, this data was linked to the Swedish Hip Arthroplasty Register (SHAR). The SHAR is a national quality register with 98% completeness of registrations for primary THA (Rolfson et al. 2011). There were 135 patients with a subsequent THA in the same hip as the hip arthroscopy identified (treated group). From the SHAR, an overall control group of 71,891 patients with a THA between 2011 and 2018 was created. Only the first operated hip was included in bilaterally operated patients. Patients with a fracture or a tumor as an indication for THA were excluded from the overall control group (Figure). From the overall control group, a 1:4 matched control group with no history of previous hip arthroscopy was further identified. Outcome measures 3 different PROMs were used: the EQ-5D-3L health status questionnaire of the EuroQoL-group, a 5-level Likert scale addressing hip pain and a 5-level Likert scale addressing satisfaction with the outcome of THA. These PROMs are part of the routine follow-up program in the SHAR (Rolfson et al. 2011). The EQ-5D index ranges from –0.59 to 1, where 0 is a health state equivalent to death, 1 is equivalent to perfect health, and negative values are considered worse than death (Dolan and Roberts 2002). The EQ-5D further includes a 0 to 100 visual analogue scale (VAS) covering general health (EQ VAS) ranging from 0 to 100. To define the minimal important

difference (MID) of the EQ VAS, an improvement of 15 points was used (King 2011). The hip pain Likert scale ranges from 1 (no pain) to 5 (severe pain). The satisfaction item ranges from 1 (very dissatisfied) to 5 (very satisfied). This scale was dichotomized into satisfied (including very satisfied and satisfied) and dissatisfied (very dissatisfied, dissatisfied, and neither satisfied nor dissatisfied). Preoperative and 1-year followup PROM data were included in the study. The rate of reoperation was collected from the SHAR. Demographic data Demographic data, such as age at the time of THA surgery, sex, BMI, ASA classification, and type of prosthesis fixation was collected from the SHAR. Propensity score matching and control group For comparisons between potential differences in PROMs at follow-up, a 1:4 propensity-score-matched group was included. The variables included in the propensity matching were age, sex, BMI, ASA classification, type of prothesis fixation, preoperative EQ VAS score, and preoperative hip pain. The treated group consisted of 135 patients with a prior hip arthroscopy, while the matched control group consisted of 540 patients with a THA due to primary OA. To describe and to recognize dissimilarities in the baseline characteristics, demographic data was compared between the treated group and the overall control group of 71,891 patients, as demographic data was included in the propensity match. Missing data for any of the variables included in the propensity score matching was handled by listwise deletion, meaning that patients with missing data on 1 or more of the variables were excluded from the analysis. Statistics All statistical analyses were performed using R version 3.6.1 (R Centre for Statistical Computing, Vienna, Austria). The outcomes of the 2 groups were compared using the 2-sample t-test for continuous variables and by Pearson’s chi-square test of independence for categorical variables. Results are reported with mean (SD), p-values and 95% confidence intervals (CI). Fisher’s exact test was used for comparing rate of reoperations between the 2 groups. Statistical significance was set at p < 0.05. Matching was performed using the 1:4 nearest-neighbor matching without replacement. Each treated patient was matched to 4 patients not treated with arthroscopy. The ability of the propensity-score matching to balance the baseline covariates was assessed using absolute standardized mean differences (SMD). An SMD of < 10% was considered non-significant. For continuous variables, the variances were also compared. Ethics, funding, and potential conflicts of interest The study was approved by the Swedish Ethical Review Authority (number 2019-04682). This study was not financed

410

Acta Orthopaedica 2021; 92 (4): 408–412

Table 1. Demographic data presented as numbers (%) or mean values and standard deviations (SD) with standardized mean difference (SMD) between matched control group and treated group

Overall Matched control Treated control SMD n = 71,891 n = 135 n = 540 (%)

Female sex ASA classification 1 2 3–4 Type of prothesis Cemented Hybrid Uncemented Mean age (SD) Mean BMI (SD)

40,341 (56)

51 (38)

188 (35)

6.2

17,703 (24) 43,079 (60) 11,109 (16)

95 (70) 36 (27) 4 (3)

384 (71) 137 (25) 19 (4)

4.1

42,359 (59) 12,405 (17) 17,127 (24) 68 (11) 27 (5)

9 (7) 10 (7) 116 (86) 51 (8) 27 (3)

32 (6) 44 (8) 464 (86) 52 (10) 27 (4)

4.0 3.0 0

Table 2. Patient-reported outcome for the treated group and the matched control group presented as the mean value, confidence interval (CI), and p-value for Eq-5D, and percentage for hip pain and satisfaction Variable

Matched Difference control Treated 95% CI

p-value

EQ-5D index Preoperative 0.35 0.34 –0.05 to 0.07 0.8 1 year postoperative 0.82 0.81 –0.04 to 0.05 0.9 Delta 0.47 0.47 –0.07 to 0.06 0.9 EQ VAS Preoperative 51.9 52.4 –5.0 to 3.9 0.8 1 year postoperative 78.8 78.3 –3.2 to 4.2 0.8 Delta 26.9 25.9 –4.1 to 6.1 0.7 Hip pain 1 year postoperatively (%) None 55 55 0.7 Very mild 23 21 Mild 14 15 Moderate 7 7 Severe 1 2 Satisfaction 1 year postoperatively (%) Not satisfied 14 13 0.9 Satisfied 86 87

by any external funding. The authors declare no conflicts of interest.

Results

Discussion

In the treated group, 62% were male with a mean age of 51 years, compared with 44% male and a mean age of 68 years in the overall control group (Table 1). The mean time interval to THA from hip arthroscopy was 27 months (SD 19). There were no statistically significant differences between the treated group and the matched control group with regard to the EQ-5D or the EQ VAS (Table 2). For the patients undergoing a prior hip arthroscopy, 68% experienced an improvement of 15 points or more on the EQ VAS compared with 65% of the patients in the matched control group. There was no statistically significant difference between the patients regarding hip pain postoperatively (Table 2). In the treated group and in the matched control group, 94% and 96% of the patients had improved by at least 1 point respectively. 1 year after surgery, 87% of the treated group and 86% of the matched control group were satisfied with surgery (Table 2). In the treated group, 2 patients were reoperated on during the study period and in the matched control group 19 patients were reoperated on. The reasons for reoperation in the treated group were technical reasons and complications with the implant in the first patient and a deep infection in the second patient, and in the control group the reasons were aseptic loosening (6 patients), deep infection (8 patients), fracture (3 patients), and dislocation (2 patients). The difference regarding the rate of reoperations between the 2 groups was statistically not significant (p = 0.3).

The most important finding in this study was that there were similar patient-reported outcomes between patients undergoing THA after a hip arthroscopy compared with patients without a prior hip arthroscopy. Our findings are similar to those in previous studies, where most report no differences between study groups and control groups at follow-up. A systematic review recently concluded that the short-term outcomes for patients with THA and prior arthroscopy are comparable to those patients undergoing a primary THA; however, many of the included studies were underpowered (Rosinsky et al. 2019). Further, Rosinsky et al. (2020) have reported on the longest follow-up period of 5 years so far. They found no differences in terms of PROMs in a study group comprising 33 patients, yet reported a slightly higher risk of revision for patients who had undergone a prior hip arthroscopy. While Haughom et al. (2016) acknowledged that their study was underpowered for conclusions regarding reoperation, they found the Harris Hip Score (HHS) to be higher preoperatively in the group who had prior hip arthroscopy, while there were no differences in HHS at follow-up. Conversely, Hoeltzermann et al. (2019) found the opposite in terms of the modified HHS (mHHS), which was lower preoperatively in the study group of 33 patients, but neither did they find any differences at follow-up between the group with a prior hip arthroscopy compared with a THA without a prior arthroscopy. Charles et al. (2017) found no differences in postoperative outcomes in a study group comprising 39 patients. Although most studies report no differences between groups, Perets et al. (2017) reported inferior results

Acta Orthopaedica 2021; 92 (4): 408–412

in terms of a lower HHS, a lower Forgotten Joint Score-12 (FJS-12). and patient satisfaction 2 years after THA surgery in a study group of 35 patients. Further, Vovos et al. (2019), with the largest cohort previously reported, found increased surgical time and increased intraoperative and postoperative complications in a study group of 95 patients with prior hip arthroscopy; however no differences were found regarding revision rates after 2 years’ follow-up. In this study, the number of revisions were few and a prior hip arthroscopy did not increase the risk of reoperation. We compared demographic data between patients with a history of ipsilateral hip arthroscopy prior to their THA and the overall control group in the SHAR. The group of patients with a prior hip arthroscopy were younger and consisted of more men than the overall control group. The larger proportion of men in the group who had undergone a prior hip arthroscopy is not unexpected, as FAIS is more common in men and a larger proportion of men undergo hip arthroscopy for FAIS (Sansone et al. 2014). One possible theory relating to the younger age of the group with a prior hip arthroscopy is that FAIS is a contributory factor to the development of OA, thereby leading to the need for an earlier THA (Beck et al. 2005). It is still not understood whether the surgical trauma implied by the hip arthroscopy increases or prevents the risk of developing OA and subsequently undergoing a THA. Rhon et al. (2019) found that 22% of the patients undergoing hip arthroscopy had received a clinical diagnosis of OA within 2 years of arthroscopic surgery. In spite of this, it is not known whether these patients would have developed OA regardless of their primary arthroscopic treatment. Femoroacetabular impingement syndrome is thought to increase the risk of developing osteoarthritis (OA) (Ganz et al. 2003, Beck et al. 2005). However, a study by Wyles et al. (2017) found that the natural history of hips with femoroacetabular impingement morphology was similar to that in hips with normal morphology in terms of the risk of receiving a THA. It has further been discussed whether arthroscopy for FAIS with concomitant OA could prevent the development of OA and the need for a THA or increase its progression (Ng et al. 2010, Domb et al. 2017). Most studies report improved clinical outcomes for patients undergoing hip arthroscopy for FAIS with concomitant OA (Sansone et al. 2016). However, the indication of hip arthroscopy for OA is debated (Kemp et al. 2015). Nevertheless, patients with severe OA and higher age at the time of hip arthroscopy have been shown to have inferior outcomes and a higher risk of undergoing a THA (Kemp et al. 2015). A previous study found a conversion rate of 68% within 2 years and an increased risk of revision and reoperation in patients undergoing hip arthroscopy, though the indication for the patients in that study was OA (Malahias et al. 2020). The main indication for all patients undergoing arthroscopic surgery in our study was FAIS. As the indications for arthroscopy evolve, it is important that the indication for surgery is carefully considered. Based

411

on the findings in this study, undergoing hip arthroscopy for the diagnosis of FAIS prior to a THA will not negatively affect the outcome of the THA. In accordance with previous literature, the short-term outcomes after THA are similar for patients with a prior hip arthroscopy. To our knowledge, this study has the largest study cohort reported, with 135 patients in the treatment group who were compared, after a 1:4 matching, with 540 control patients. The careful matching procedure including both demographic data and PROMs regarding hip pain and general health adds to the strength of the study. Furthermore, the SHAR has a high response rate covering 98% of all THAs performed in Sweden. However, there are limitations to this study. The study does not include intraoperative findings or surgical time. Nor does the study include the specific grade of OA prior to hip arthroscopic surgery; however, the indication for hip arthroscopic surgery was not OA in any patient. The local hip arthroscopy register includes patients undergoing a hip arthroscopy in the western part of Sweden, generating a possible risk of patients in the matched control group having undergone a prior hip arthroscopy in other parts of the country not covered by this register. Further, patients were excluded prior to the propensity-score matching due to missing data. There is always a risk of bias with missing data, but these patients should not be considered as dropouts as they were not fulfilling the requirement for inclusion in this study. Another limitation is that no sample size calculation was performed prior to the study, as all the patients who underwent hip arthroscopy prior to their THA included in the SHAR were included in this study. Although the cohort is larger than that in previous studies evaluating hip arthroscopy prior to THA, there is still a risk of type 2 error. This study reports outcomes 1 year after undergoing a THA and it would be interesting to follow the cohort for a longer period. In conclusion, prior hip arthroscopy for FAIS does not appear to affect the patient-reported outcomes of a future THA. In the decision to undergo hip arthroscopy, these results offer reassurance that such an intervention is not likely to influence patient-reported outcomes after an eventual future THA and indicate that patients are not at risk of inferior results due to their prior hip arthroscopy.

IL, AÖ, KS, MM, OR, and MS planned and performed the study. JN performed the statistical analysis. IL made the first draft of the manuscript and then received contributions from all co-authors. Acta thanks Arild Aamodt and Thomas Kalteis for help with peer review of this study.

Beck M, Kalhor M, Leunig M, Ganz R. Hip morphology influences the pattern of damage to the acetabular cartilage: femoroacetabular impingement as a cause of early osteoarthritis of the hip. J Bone Joint Surg Br 2005; 87(7): 1012-8. doi: 10.1302/0301-620X.87B7.15203.

412

Charles R, LaTulip S, Goulet J A, Pour A E. Previous arthroscopic repair of femoro-acetabular impingement does not affect outcomes of total hip arthroplasty. Int Orthop 2017; 41(6): 1125-9. doi: 10.1007/s00264-016-3330-0. Dolan P, Roberts J. Modelling valuations for Eq-5d health states: an alternative model using differences in valuations. Med Care 2002; 40(5): 442-6. doi: 10.1097/00005650-200205000-00009. Domb B G, Chaharbakhshi E O, Rybalko D, Close M R, Litrenta J, Perets I. Outcomes of hip arthroscopic surgery in patients with Tonnis Grade 1 osteoarthritis at a minimum 5-year follow-up: a matched-pair comparison with a Tonnis Grade 0 control group. Am J Sports Med 2017; 45(10): 2294-302. doi: 10.1177/0363546517706957. Ganz R, Parvizi J, Beck M, Leunig M, Notzli H, Siebenrock K A. Femoroacetabular impingement: a cause for osteoarthritis of the hip. Clin Orthop Relat Res 2003; (417): 112-20. doi: 10.1097/01.blo.0000096804.78689.c2. Griffin D R, Dickenson E J, Wall P D H, Achana F, Donovan J L, Griffin J, Hobson R, Hutchinson C E, Jepson M, Parsons N R, Petrou S, Realpe A, Smith J, Foster N E, Group F A S. Hip arthroscopy versus best conservative care for the treatment of femoroacetabular impingement syndrome (UK FASHIoN): a multicentre randomised controlled trial. Lancet 2018; 391(10136): 2225-35. doi: 10.1016/S0140-6736(18)31202-9. Harris J D, McCormick F M, Abrams G D, Gupta A K, Ellis T J, Bach B R Jr, Bush-Joseph C A, Nho S J. Complications and reoperations during and after hip arthroscopy: a systematic review of 92 studies and more than 6,000 patients. Arthroscopy 2013; 29(3): 589-95. doi: 10.1016/j. arthro.2012.11.003. Haughom B D, Plummer D R, Hellman M D, Nho S J, Rosenberg A G, Della Valle C J. Does hip arthroscopy affect the outcomes of a subsequent total hip arthroplasty? J Arthroplasty 2016; 31(7): 1516-18. doi: 10.1016/j. arth.2016.01.008. Hoeltzermann M, Sobau C, Miehlke W, Zimmerer A. Prior arthroscopic treatment for femoro-acetabular impingement does not compromise hip arthroplasty outcomes: a matched-controlled study with minimum twoyear follow-up. Int Orthop 2019; 43(7): 1591-6. doi: 10.1007/s00264-01904330-0. Kemp J L, MacDonald D, Collins N J, Hatton A L, Crossley K M. Hip arthroscopy in the setting of hip osteoarthritis: systematic review of outcomes and progression to hip arthroplasty. Clin Orthop Relat Res 2015; 473(3): 1055-73. doi: 10.1007/s11999-014-3943-9. King M T. A point of minimal important difference (MID): a critique of terminology and methods. Expert Rev Pharmacoecon Outcomes Res 2011; 11(2): 171-84. doi: 10.1586/erp.11.9. Malahias M A, Gu A, Richardson S S, De Martino I, Sculco P K, McLawhorn A S. Hip arthroscopy for hip osteoarthritis is associated with increased risk for revision after total hip arthroplasty. Hip Int 2020:1120700020911043. doi: 10.1177/1120700020911043. Online ahead of print. Matar H E, Rajpura A, Board T N. Femoroacetabular impingement in young adults: assessment and management. Br J Hosp Med (Lond) 2019; 80(10): 584-8. doi: 10.12968/hmed.2019.80.10.584.

Acta Orthopaedica 2021; 92 (4): 408–412

Ng V Y, Arora N, Best T M, Pan X, Ellis T J. Efficacy of surgery for femoroacetabular impingement: a systematic review. Am J Sports Med 2010; 38(11): 2337-45. doi: 10.1177/0363546510365530. Ohlin A, Ahlden M, Lindman I, Jonasson P, Desai N, Baranto A, Ayeni O R, Sansone M. Good 5-year outcomes after arthroscopic treatment for femoroacetabular impingement syndrome. Knee Surg Sports Traumatol Arthrosc 2020; 28(4): 1311-6. doi: 10.1007/s00167-019-05429-y. Perets I, Mansor Y, Mu B H, Walsh J P, Ortiz-Declet V, Domb B G. Prior Arthroscopy leads to inferior outcomes in total hip arthroplasty: a matchcontrolled study. J Arthroplasty 2017; 32(12): 3665-8. doi: 10.1016/j. arth.2017.06.050. Rhon D I, Greenlee T A, Sissel C D, Reiman M P. The two-year incidence of hip osteoarthritis after arthroscopic hip surgery for femoroacetabular impingement syndrome. BMC Musculoskelet Disord 2019; 20(1): 266. doi: 10.1186/s12891-019-2646-5. Rolfson O, Kärrholm J, Dahlberg L E, Garellick G. Patient-reported outcomes in the Swedish Hip Arthroplasty Register: results of a nationwide prospective observational study. J Bone Joint Surg Br 2011; 93(7): 867-75. doi: 10.1302/0301-620X.93B7.25737. Rosinsky P J, Kyin C, Shapira J, Maldonado D R, Lall A C, Domb B G. Hip arthroplasty after hip arthroscopy: are short-term outcomes affected? A systematic review of the literature. Arthroscopy 2019; 35(9): 2736-46. doi: 10.1016/j.arthro.2019.03.057. Rosinsky P J, Chen J W, Shapira J, Maldonado D R, Lall A C, Domb B G. Mid-term patient-reported outcomes of hip arthroplasty after previous hip arthroscopy: a matched case-control study with a minimum 5-year follow-up. J Am Acad Orthop Surg 2020; 28(12): 501-10. doi: 10.5435/ JAAOS-D-19-00459. Sansone M, Ahlden M, Jonasson P, Thomee C, Sward L, Baranto A, Karlsson J, Thomee R. A Swedish hip arthroscopy registry: demographics and development. Knee Surg Sports Traumatol Arthrosc 2014; 22(4): 774-80. doi: 10.1007/s00167-014-2840-9. Sansone M, Ahlden M, Jonasson P, Thomee C, Sward L, Collin D, Baranto A, Karlsson J, Thomee R. Outcome of hip arthroscopy in patients with mild to moderate osteoarthritis: a prospective study. J Hip Preserv Surg 2016; 3(1): 61-7. doi: 10.1093/jhps/hnv079. Vovos T J, Lazarides A L, Ryan S P, Kildow B J, Wellman S S, Seyler T M. Prior hip arthroscopy increases risk for perioperative total hip arthroplasty complications: a matched-controlled study. J Arthroplasty 2019; 34(8): 1707-10. doi: 10.1016/j.arth.2019.03.066. Wyles C C, Heidenreich M J, Jeng J, Larson D R, Trousdale R T, Sierra R J. The John Charnley Award: Redefining the natural history of osteoarthritis in patients with hip dysplasia and impingement. Clin Orthop Relat Res 2017; 475(2): 336-50. doi: 10.1007/s11999-016-4815-2. Zhou J, Melugin H P, Hale R F, Leland D P, Bernard C D, Levy B A, Krych A J. The prevalence of radiographic findings of structural hip deformities for femoroacetabular impingement in patients with hip pain. Am J Sports Med 2020; 48(3): 647-53. doi: 10.1177/0363546519896355.

Acta Orthopaedica 2021; 92 (4): 413–418

413

Dislocation of hemiarthroplasty after hip fracture is common and the risk is increased with posterior approach: result from a national cohort of 25,678 individuals in the Swedish Hip Arthroplasty Register Ammar JOBORY 1, Johan KÄRRHOLM 2,3, Susanne HANSSON 1, Kristina ÅKESSON 1, and Cecilia ROGMARK 1,2 1 Department of Orthopaedics, Lund University, Skåne University Hospital, Malmö; 2 Swedish Hip Arthroplasty Register, Registercentrum Gothenburg; 3 Department of Orthopaedics, Institute of Clinical Sciences, Sahlgrenska Academy, University of Gothenburg, Sweden

Västra Götaland,

Correspondence: ammar.jobory@med.lu.se Submitted 2020-04-02. Accepted 2021-03-01.

Background and purpose — Reported revision rates due to dislocation after hemiarthroplasty span a wide range. Dislocations treated with closed reduction are rarely reported despite the fact that they can be expected to constitute most of the dislocations that occur. We aimed to describe the total dislocation rate on the national level, and to identify risk factors for dislocation. Patients and methods — We co-processed a national cohort of 25,678 patients in the Swedish Hip Arthroplasty Register, with the National Patient Register (NPR) and Statistics Sweden. Dislocation was defined as the occurrence of any ICD-10 or procedural code related to hip dislocation recorded in the NPR, with a minimum of 1-year-follow-up. In theory, all early dislocations should thereby be traced, including those treated with closed reduction only. Results — 366/13,769 (2.7%) patients operated on with direct lateral approach dislocated, compared with 850/11,834 (7.2%) of those with posterior approach. Posterior approach was the strongest risk factor for dislocation (OR = 2.7; 95% CI 2.3–3.1), followed by dementia (OR = 1.3; CI 1.1–1.5). The older the patients, the lower the risk of dislocation (OR = 0.98 per year of age; CI 0.98–1.0). Neither bipolar design nor cementless stems influenced the risk. Interpretation — The choice of posterior approach and dementia was associated with an increased dislocation risk. When hips treated with closed reduction were identified, the frequency of dislocation with use of direct lateral and posterior approach more than doubled and tripled, respectively, compared with when only revisions due to dislocation are measured.

Displaced femoral neck fractures in elderly patients have traditionally been treated with hemiarthroplasty (HA). Dislocation of the prosthesis is a major complication, affecting 1.5–15% of patients (Enocson et al. 2008, Figved et al. 2009, Leonardsson et al. 2012, Bensen et al. 2014, Parker 2015, Svenoy et al. 2017). The varying rate may be explained by different surgical approach, follow-up time, age, and frailty of the patients. In addition, dislocation may be defined and reported in various ways, for example closed reduction, revision surgery, or both. A systematic review of 7 randomized trials, with a mix of approaches and 1–5 years’ follow-up time, reported a risk of revision due to dislocation of 3% (Burgers et al. 2012). Only open surgery due to dislocation (i.e., open reduction or revision) is reported in the Swedish Hip Arthroplasty (SHAR). By including closed reduction with a linkage to the National Patient Register (NPR) the under-reporting of dislocation can be highlighted. Risk factors for dislocation can be divided into surgically related, implant-related, and patient-related factors. Posterior approach is a known surgically related risk factor (Varley and Parker 2004, Enocson et al. 2008, Leonardsson et al. 2012, Abram and Murray 2015, Svenoy et al. 2017). The risk is even higher if complete posterior repair is not performed (Enocson et al. 2008, Kim et al. 2016, Svenoy et al. 2017). Others are discrepancy of offset (Madanat et al. 2012, Mukka et al. 2015, Li et al. 2016) and, for elective THA, faulty positioning of the stem (McCollum and Gray 1990). Gjertsen et al. (2012) showed increased risk of revision because of dislocation if an uncemented technique was used compared with cementation, while other studies concluded no such association (Varley and Parker 2004, Figved et al. 2009, Abram and Murray 2015). The influence of the prosthetic design, uni- or bipolar head, on the risk of reoperation or dislocation in hip fracture patients

414

is unclear. Several studies found no difference (Varley and Parker 2004, Enocson et al. 2008, 2012), while Leonardsson et al. (2012) showed increased risk of reoperation caused by dislocation with bipolar HA. For fracture patients, 2 studies (Li et al. 2016, Kristoffersen et al. 2020) reported dementia to increase the risk of dislocation while others (Ninh et al. 2009, Madanat et al. 2012, Abram and Murray 2015, Svenoy et al. 2017) did not. Neurological disease (Li et al. 2016) and dysplasia (Madanat et al. 2012, Mukka et al. 2015) are reported patient-related risk factors, whereas age, sex, and comorbidity do not seem to be associated with risk of dislocation (Enocson et al. 2008, 2012, Madanat et al. 2012, Abram and Murray 2015, Kim et al. 2016, Mukka et al. 2015, Svenoy et al. 2017). The influence of other possible confounders such as socioeconomic factors on the risk of dislocation have not, to our knowledge, been studied earlier. We aimed to describe the total dislocation rate on a national level and to explore risk factors with possible influence on the dislocation rate.

Patients and methods SHAR is a national quality register for hip replacement operations in Sweden. SHAR has a coverage of 100% for all hospitals performing joint replacement surgery in Sweden, both public and private. Since 2005 hemiarthroplasties have also been reported with a completeness of approximately 96%. The completeness for reporting revisions (both HA and THA) is approximately 93% (Kärrholm et al. 2019). Open, but not closed, reductions of dislocations are reported to SHAR. In the Swedish NPR, the Swedish National Board of Health and Welfare has collected data on diseases, surgical treatments, and medical care measures since 2001 (Ludvigsson et al. 2011). This includes all inpatients, both public and private hospitals, outpatient visits including day surgery, and psychiatric care from both private and public caregivers. Primary care is not yet covered in the NPR. Statistics Sweden (SCB) (2018) is responsible for official statistics in Sweden. SCB develops, produces, and disseminates statistics on Swedish residents and provides socioeconomic data, factors that can interact with both treatment decisions and outcome. As identified in SHAR, hemiarthroplasty was used to treat 25,678 patients with acute hip fracture during 2005–2011 and these were included in this observational cohort study. Aided by the unique personal identity number given to each Swedish citizen, either at birth or on immigration, individuals can be cross-matched to NPR and SCB. ICD-10 codes were used for main and secondary diagnosis (WHO 2016). For procedural codes, NOMESCO codes (NOMESCO 2011) were used. The codes used to define hip prosthesis dislocation were M24.3-4, M24.4F, S73.0, T93.3, and all NOMESCO codes related to hip arthroplasty dislocation. Information concerning education and marital status was extracted from SCB.

Acta Orthopaedica 2021; 92 (4): 413–418

Hemiarthroplasty after hip fracture in the Swedish Hip Arthroplasty Register 2005–2012 n = 38,086 Excluded (n = 12,408): – not operated in 2005–2011, 4,986 – not acute hip fracture, 4,653 – second hip fracture, 2,051 – duplicate entries, 718 Hemiarthroplasty after acute hip fracture 2005–2011 n = 25,678

Excluded due to missing data (n = 4,044): (overlap between groups) – regarding dementia, 3,945 – regarding surgical approach, 111 – regarding education, 27 – regarding marital status, 27 – regarding stem, 1 Included in logistic regression analysis n = 21,634

Flowchart for the study.

A data set was created from the registers including patients with hip fracture treated with HA during 2005–2012. Since the majority of the dislocations occur within the 1st months after hemiarthroplasty surgery (Madanat et al. 2012, Gill et al. 2018), we decided on a minimum of 1-year follow-up, i.e., operations during 2012 were excluded. Only the 1st hip fracture surgery was included in patients with a 2nd contralateral hip surgery. Statistics Possible risk factors and revision caused by dislocation were calculated using a chi-square test. The Elixhauser index was regrouped into 4 categories (0, 1, 2, and 3+) before analysis. Logistic regression analysis was used to evaluate the mutually adjusted effect of possible risk factors. This method was chosen because, due to the complexity of the final co-processed dataset, we had information only on whether a dislocation had occurred, not the date for such an event. The risk factors included in the logistic regression analysis were age, sex, surgical approach, cementation, prosthesis design (uni-/ bipolar), dementia, Elixhauser in four categories, education, and civil status. Dementia was classified in SHAR as “none,” “suspect,” and “clear” cognitive impairment, based on a judgment of the patient and previous journals. In our analysis, “suspect” and “clear” were grouped together. 4,044 patients were excluded from this analysis because of missing data, mainly on dementia (Figure 1). 95% confidence interval is abbreviated as CI in the Results section. We used IBM SPSS Statistics 24 as statistical software (IBM Corp, Armonk, NY, USA). Ethics, funding, and potential conflicts of interest The study was approved by the regional Ethical Review Board in Gothenburg, Sweden (271-14). The study adhered to the STROBE (Strengthening the Reporting of Observational

Acta Orthopaedica 2021; 92 (4): 413–418

Table 1. Patient characteristics and dislocation rate. Values are count (%) Dislocation No Yes n = 24,458 (95) n = 1,220 (5) p-value Age < 0.001 < 75 2,101 (94) 135 (6) 75–85 11,577 (95) 605 (5) > 85 10,780 (96) 480 (4) Sex 0.09 Female 17,347 (95) 838 (5) Male 7,111 (95) 382 (5) Surgical approach (missing 111) < 0.001 Posterior 10,984 (93) 850 (7) Direct lateral 13,403 (97) 366 (3) Stem (missing 1) 0.7 Cemented 23,718 (95) 1,181 (5) Uncemented 739 (95) 39 (5) Head < 0.001 Unipolar 12,463 (96) 529 (4) Bipolar 11,995 (95 691 (5) Dementia (missing 3,945) 0.005 No 14,357 (96) 647 (4) Yes 6,380 (95 349 (5) Elixhauser 0.08 0 15,345 (95) 733 (5) 1 4,009 (95) 232 (5) 2 2,534 (95) 133 (5) 3 2,570 (96) 122 (4) Education (missing 27) 0.3 Primary school 14,777 (95) 766 (5) High school 6,519 (95) 311 (5) University 3,135 (96) 143 (4) Civil status (missing 27) 0.7 Partner 9,312 (95) 471 (5) Alone 15,119 (95) 749 (5)

Studies in Epidemiology) guidelines (von Elm et al. 2007). Due to Swedish legislation, data on individual study objects cannot be shared. This work was supported by grants from the Southern Health Care Region, and Swedish Research Council funding for clinical research in medicine, Sweden. No competing interests were declared.

415

Table 2. Logistic regression model (n = 21,634) Factor

OR (95% CI)

Age (per year) 0.98 (0.98–1.0) Male sex 1.1 (0.97–1.3) Posterior approach 2.7 (2.3–3.1) Cemented stem fixation 0.83 (0.58–1.2 Unipolar head 0.93 (0.82–1.1) Dementia 1.3 (1.1–1.5) Elixhauser (ref. 0) 1 1.2 (0.97–1.4) 2 1.1 (0.86–1.3) 3 1.0 (0.82–1.3) Education (ref. primary school) high school 0.93 (0.80–1.1) university 1.0 (0.81–1.2) Civil status (partner) 0.92 (0.80–1.1)

Using logistic regression analysis, posterior approach was found to be the most pronounced risk factor for dislocation (OR = 2.7; CI 2.3–3.1) (Table 2). Higher age was associated with a lower risk of dislocation (OR = 0.98 per year of age; CI 0.98–1.0). In the 21,733 patients with complete data on cognitive status, dementia was associated with increased risk (OR = 1.3; CI 1.1–1.5). In a subgroup analysis of 6,709 patients with suspected or manifest cognitive impairment, 129/3,962 (3.3%) patients with direct lateral approach dislocated, compared with 220/2,747 (8.0%) of those with posterior approach. Sex, education, or marital status choice of uni- or bipolar design and type of fixation had no statistically significant influence on the risk of dislocation. Subgroup analysis of prosthetic design showed that 32% of the unipolar (3,917/12,423) and 59% of the bipolar (7,031/11,928) hemiarthroplasties were operated with posterior approach. In most of the patients the stem had been fixed with bone cement (24,899/25,677 patients, 97%).

Discussion Results Overall, the rate of dislocation was 1,220/25,678 (4.8%). Patients treated with posterior approach had a dislocation rate of 850/11,834 (7.2 %) compared with 366/13,769 (2.7%) with direct lateral approach (Table 1). The overall frequency of HA revisions caused by dislocation as reported to SHAR was 1.6% (403/25,678). Patients treated with posterior approach had a frequency of 2.0% (241/11,834) and those treated with direct lateral approach a frequency of 1.2% (162/13,769; p < 0.001). Thus, the relative share of dislocations increased from 2.0% to 7.2% with use of a posterior incision and from 1.2% to 2.7% with use of a lateral approach when hips treated with closed reduction only were included.

With a dislocation rate of 7% after posterior approach and 3% after direct lateral, our findings are on a par with earlier observations of increased risk dislocation with posterior approach compared with direct lateral approach (Enocson et al. 2008, Madanat et al. 2012, Abram and Murray 2015, Svenoy et al. 2017), but with the strength of a much larger cohort on a national level. Both our analysis and earlier studies showed the posterior approach to be the most important risk factor for dislocation (Varley and Parker 2004, Enocson et al. 2008, Abram and Murray 2015, Svenoy et al. 2017, Gill et al. 2018). Use of tendon-to-bone repair may reduce the risk of dislocation after a posterior approach (Enocson et al. 2008, Kim et al. 2016). Such a posterior repair has been practiced for many years in Sweden (Enocson et al. 2008), but its use is

416

not reported to the SHAR. Most probably the risk of dislocation with use of the posterior approach as observed by us can be improved with use of posterior repair performed by experienced hip surgeons. These optimum conditions are, however, usually not present when hemiarthroplasty is performed as everyday practice in Sweden. Both residents and consultants do emergency hip fracture surgery. According to the Swedish Fracture Register (2018), approximately 25% of hemiarthroplasties are performed by residents, 20% by highly specialized trauma consultants, less than 5% by highly specialized arthroplasty consultants, and the rest by other consultants. The direct lateral approach may be associated with other potential shortcomings such as lateral pain and a positive Trendelenburg test. To what extent this is clinically relevant for individuals with hip fracture or should be weighed against the increased dislocation risk observed after use of a posterior approach remains uncertain (Palan et al. 2009, Leonardsson et al. 2013, Kristensen et al. 2017, Mukka et al. 2017). We found an overall dislocation rate of 5%. The SHAR reports only open treatment of a dislocation, i.e., not closed reduction in the emergency room or operating theatre. We found the revision rate caused by dislocation to be 1.6%. The outcome measure “revision due to dislocation” clearly underestimates the clinical problem with dislocation, as only 1 in 3 individuals with dislocation(s) had revision surgery. That 2/3 of patients with a dislocating hemiarthroplasty suffer recurrent dislocations (Madanat et al. 2012, Gill et al. 2018) underlines the problem with revision as outcome for fracture patients. Dislocation is not an absolute indication for revision surgery and exchange of implant parts. The risk of open surgery is weighed against the risk of forthcoming dislocations and the discomfort of the patients. Patients may be advised against secondary open surgery, or decline the proposal themselves. Nevertheless, repeated dislocations treated with closed reduction are usually devastating for the patient as reflected in reduced health-related quality of life (Enocson et al. 2009) and also result in additional hospital costs (Sanchez-Sotelo et al. 2006). We found dementia to be associated with an increased risk of dislocation. This is in line with Li et al. (2016), but in contrast to other studies (Ninh et al. 2009, Madanat et al. 2012, Abram and Murray 2015, Kim et al. 2016, Svenoy et al. 2017). All these studies comprise smaller patient groups than observed by us and may lack statistical power. Individuals with either manifest dementia or suspicion of cognitive impairment showed increased risk of dislocation with the posterior approach compared with the direct lateral approach. Although commonly recommended, movement precautions and mandatory use of ADL equipment during the recovery phase do not affect the risk of dislocation when patients are operated on with the direct lateral approach (Jobory et al. 2019). Posterior approach has not, to our knowledge, been studied in this context, but many surgeons prescribe movement precautions and mandatory use of ADL equipment to avoid dislocation. Since

Acta Orthopaedica 2021; 92 (4): 413–418

patients with dementia will have difficulties following such precautions, this is another reason why the posterior approach should be particularly avoided in individuals with concomitant cognitive impairment. Several studies have shown age not to be a risk factor (Ninh et al. 2009, Madanat et al. 2012, Abram and Murray 2015, Mukka et al. 2015, Svenoy et al. 2017). In contrast, we found that older patients have lower risk of dislocation. Another Swedish register study found that patients under 75 years of age were at higher risk of reoperation due to dislocations than those over 85 (Leonardsson et al. 2012). Younger patients may have a more active lifestyle, and are therefore more prone to dislocation. Using open secondary surgery as outcome measure can also introduce selection bias, as younger patients more often may be recommended reoperation, whilst old and frail patients more frequently may be treated with repeated closed reductions only. Therefore, we believe that our result, including virtually all dislocations, confirms the lower risk among the oldest. We found no difference in dislocation risk between bipolar and unipolar HA in the logistic regression analysis. The difference in crude dislocation rate may be explained by the posterior approach being more common for bipolar HA, than for unipolar HA. Our observation is supported by earlier studies (Varley and Parker 2004, Enocson et al. 2008, 2012), but Leonardsson et al. (2012) found the bipolar design to be associated with increased risk of revision caused by dislocation. In agreement with earlier studies (Varley and Parker 2004, Figved et al. 2009, Abram and Murray 2015) we found no difference between cemented versus uncemented HA regarding the risk of dislocation. Comorbidity seems not to affect the risk of dislocation, in accordance with previous studies (Madanat et al. 2012, Mukka et al. 2015, Kim et al. 2016, Svenoy et al. 2017). A smaller retrospective study reported male sex as a risk factor (Ninh et al. 2009), but neither we, nor several other studies (Enocson et al. 2008, 012, Leonardsson et al. 2012, Madanat et al. 2012, Abram and Murray 2015, Kim et al. 2016, Svenoy et al. 2017), identified sex as a risk factor. Patient compliance usually influences the choice between THA and HA in clinical everyday life, partly because of the risk of dislocation. Lifestyle factors, including substance or alcohol abuse, can affect patient compliance. However, 2 studies found alcohol abuse disease not to affect the risk of dislocation (Madanat et al. 2012, Svenoy et al. 2017). Obtaining reliable data on whether a patient is addicted is difficult, as both medical records and register data will underestimate the problem. As blunt proxies for socioeconomic distress, we found that education level and marital status did not affect the dislocation rate. The strength of our study is that it includes a large amount of material with a variety of patients and surgeons. We therefore believe that our data reflects everyday practice in Sweden, with good generalizability to the majority of other public

Acta Orthopaedica 2021; 92 (4): 413–418

healthcare systems. With the use of both diagnostic and therapeutic codes for hip dislocation and closed reduction we have tried to cover as many treatment occasions of hip dislocation as possible. Nonetheless, our study also has limitations. We lack information on use of posterior soft-tissue repair or not, implant positioning (no access to postoperative radiographs), and the skills of the surgeons involved. All these factors have been reported to influence the dislocation rate (Enocson et al. 2008, 2012, Madanat et al. 2012, Abram and Murray 2015, Mukka et al. 2015, Kim et al. 2016, Svenoy et al. 2017). Further, information on dementia was missing in 1/6 of the patients. We might also have overestimated the dislocation rate because of lack of information on laterality in the NPR. This means that a dislocation on the opposite side might have been included in those with bilateral arthroplasties. Nevertheless, these events could be expected to be equally distributed between the groups studied, and our numbers are on a par with smaller clinical studies (Enocson et al. 2008, Madanat et al. 2012, Abram and Murray 2015, Svenoy et al. 2017). Finally, we could not account for any differences in time to follow-up between groups of patients with different presumed risk factors e.g., between male and female patients. Previous studies have, however, shown that dislocation is an early complication mainly occurring within the first postoperative months with decreasing incidence up to 1 year (Ninh et al. 2009, Madanat et al. 2012, Gill et al. 2018). Therefore, we think that the main conclusions from this study would have been the same if time to dislocation had been considered in the analyses. We have not found any previous study analyzing the total dislocation rate on a national level. Our findings of a high dislocation rate in Sweden are unsatisfying and suggests that the results can be improved. Such an improvement would include several measures, not least improved surgical technique, especially with use of the posterior approach. We think that it is important to share our results with the orthopedic community and discuss it in the light of both limitations and strengths. One should realize that these patients have neither the time, nor the opportunity, to seek help at another hospital or to find an expert arthroplasty surgeon. Only if we point out how high the dislocation rate is will surgeons realize that “business as usual” is not good enough. There is then fertile soil for introducing better surgical technique(s) and better tutoring. In conclusion, posterior approach for hemiarthroplasty is associated with a higher risk of dislocation compared with direct lateral approach. Patients with dementia have an increased risk. Our data refutes bipolar head as a risk factor for dislocation. Neither did socioeconomic factors or comorbidity play a role. 2 in 3 dislocations were treated with closed reduction only. The outcome measure “revision due to dislocation” underestimates the clinical problem with dislocation.

417

The authors would like to thank all past and present SHAR coworkers in Gothenburg, as well as all surgeons and other healthcare personnel who report to the SHAR, and all participating patients. Conception and design of study: AJ, JK, KÅ, CR. Acquisition of data: AJ, SH, JK, CR. Analysis and interpretation of data: AJ, SH, JK, KÅ, CR. Manuscript drafting: AJ, CR. Revision of the manuscript: AJ, SH, JK, KÅ, CR. Acta thanks Stefan Bolder and Jan-Erik Gjertsen for help with peer review of this study. Abram S G, Murray J B. Outcomes of 807 Thompson hip hemiarthroplasty procedures and the effect of surgical approach on dislocation rates. Injury 2015; 46(6): 1013-17. doi: 10.1016/j.injury.2014.12.016 Bensen A S, Jakobsen T, Krarup N. Dual mobility cup reduces dislocation and re-operation when used to treat displaced femoral neck fractures. Int Orthop 2014; 38(6): 1241-45. doi: 10.1007/s00264-013-2276-8 Burgers P T, Van Geene A R, Van den Bekerom M P, Van Lieshout E M, Blom B, Aleem I S, Poolman R W. Total hip arthroplasty versus hemiarthroplasty for displaced femoral neck fractures in the healthy elderly: a meta-analysis and systematic review of randomized trials. Int Orthop 2012; 36(8): 1549-60. doi: 10.1007/s00264-012-1569-7 Enocson A, Tidermark J, Tornkvist H, Lapidus L J. Dislocation of hemiarthroplasty after femoral neck fracture: better outcome after the anterolateral approach in a prospective cohort study on 739 consecutive hips. Acta Orthop 2008; 79(2): 211-17. doi: 10.1080/17453670710014996 Enocson A, Pettersson H, Ponzer S, Tornkvist H, Dalen N, Tidermark J. Quality of life after dislocation of hip arthroplasty: a prospective cohort study on 319 patients with femoral neck fractures with a one-year followup. Qual Life Res 2009; 18(9): 1177-84. doi: 10.1007/s11136-009-9531-x Enocson A, Hedbeck C J, Tornkvist H, Tidermark J, Lapidus L J. Unipolar versus bipolar Exeter hip hemiarthroplasty: a prospective cohort study on 830 consecutive hips in patients with femoral neck fractures. Int Orthop 2012; 36(4): 711-17. doi: 10.1007/s00264-011-1326-3 Figved W, Opland V, Frihagen F, Jervidalo T, Madsen J E, Nordsletten L. Cemented versus uncemented hemiarthroplasty for displaced femoral neck fractures. Clin Orthop Relat Res 2009; 467(9): 2426-35. doi: 10.1007/ s11999-008-0672-y Gill J R, Kiliyanpilakkill B, Parker M J. Management and outcome of the dislocated hip hemiarthroplasty. Bone Joint J 2018; 100-B(12): 1618-25. doi: 10.1302/0301-620X.100B12.BJJ-2018-0281.R1 Gjertsen J E, LieS A, Vinje T, Engesaeter L B, Hallan G, Matre K, Furnes O. More re-operations after uncemented than cemented hemiarthroplasty used in the treatment of displaced fractures of the femoral neck: an observational study of 11,116 hemiarthroplasties from a national register. J Bone Joint Surg Br 2012; 94(8): 1113-19. doi:10.1302/0301-620X.94B8.29155 Jobory A, Rolfson O, Akesson K E, Arvidsson C, Nilsson I, Rogmark C. Hip precautions not meaningful after hemiarthroplasty due to hip fracture: cluster-randomized study of 394 patients operated with direct anterolateral approach. Injury 2019; 50(7): 1318-23. doi: 10.1016/j.injury.2019.05.002 Kärrholm J, Mohaddes M, Odin D, Vinblad J, Rogmark C, Rolfson O. Swedish Hip Arthroplasty Register 2019. Annual report; 2018 (in Swedish). Kim Y, Kim J K, Joo I H, Hwang K T, Kim Y H. Risk factors associated with dislocation after bipolar hemiarthroplasty in elderly patients with femoral neck fracture. Hip Pelvis 2016; 28(2): 104-11. doi: 10.5371/hp.2016.28.2.104 Kristensen T B, Vinje T, Havelin L I, Engesaeter L B, Gjertsen J E. Posterior approach compared to direct lateral approach resulted in better patientreported outcome after hemiarthroplasty for femoral neck fracture. Acta Orthop 2017; 88(1): 29-34. doi: 10.1080/17453674.2016.1250480 Kristoffersen M H, Dybvik E, Steihaug O M, Kristensen T B, Engesaeter L B, Ranhoff A H, Gjertsen J E. Cognitive impairment influences the risk of reoperation after hip fracture surgery: results of 87,573 operations reported to the Norwegian Hip Fracture Register. Acta Orthop 2020; 91(2): 146-51. doi: 10.1080/17453674.2019.1709712

418

Leonardsson O, Kärrholm J, Åkesson K, Garellick G, Rogmark C. Higher risk of reoperation for bipolar and uncemented hemiarthroplasty. Acta Orthop 2012; 83(5): 459-66. doi: 10.3109/17453674.2012. 727076 Leonardsson O, Rolfson O, Hommel A, Garellick G, Akesson K, Rogmark C. Patient-reported outcome after displaced femoral neck fracture: a national survey of 4467 patients. J Bone Joint Surg Am 2013; 95(18): 1693-99. doi: 10.2106/JBJS.L.00836 Li L, Ren J, Liu J, Wang H, Sang Q, Liu Z, Sun T. What are the risk factors for dislocation of hip bipolar hemiarthroplasty through the anterolateral approach? A nested case-control study. Clin Orthop Relat Res 2016; 474(12): 2622-29. doi: 10.1007/s11999-016-5053-3 Ludvigsson J F, Andersson E, Ekbom A, Feychting M, Kim J L, Reuterwall C, Olausson P O. External review and validation of the Swedish national inpatient register. BMC Public Health 2011; 11, 450. doi: 10.1186/1471-2458-11-450 Madanat R, Makinen T J, Ovaska M T, Soiva M, Vahlberg T, Haapala, J. Dislocation of hip hemiarthroplasty following posterolateral surgical approach: a nested case-control study. Int Orthop 2012; 36(5): 935-40. doi: 10.1007/s00264-011-1353-0 McCollum D E, Gray W J. Dislocation after total hip arthroplasty: causes and prevention. Clin Orthop Relat Res 1990; (261): 159-70. Mukka S, Lindqvist J, Peyda S, Broden C, Mahmood S, Hassany H, Sayed-Noor A. Dislocation of bipolar hip hemiarthroplasty through a postero-lateral approach for femoral neck fractures: a cohort study. Int Orthop 2015; 39(7): 1277-82. doi: 10.1007/s00264-014-2642-1 Mukka S, Knutsson B, Majeed A, Sayed-Noor A S. Reduced revision rate and maintained function after hip arthroplasty for femoral neck fractures after transition from posterolateral to direct lateral approach. Acta Orthop 2017; 88(6): 627-33. Ninh C C, Sethi A, Hatahet M, Les C, Morandi M, Vaidya R. Hip dislocation after modular unipolar hemiarthroplasty. J Arthroplasty 2009; 24(5): 768-74. doi: 10.1016/j.arth.2008.02.019

Acta Orthopaedica 2021; 92 (4): 413–418

NOMESCO. NOMESCO Classification of Surgical Procedures (NCSP), version 1.16; 2011. Retrieved from: http://urn.kb.se/resolve?urn = urn:nbn:se:norden:org:diva-4605 Palan J, Beard D J, Murray D W, Andrew J G, Nolan J. Which approach for total hip arthroplasty: anterolateral or posterior? Clin Orthop Relat Res 2009; 467(2): 473-77. doi: 10.1007/s11999-008-0560-5 Parker M J. Lateral versus posterior approach for insertion of hemiarthroplasties for hip fractures: a randomised trial of 216 patients. Injury 2015; 46(6): 1023-27. doi: 10.1016/j.injury.2015.02.020 Sanchez-Sotelo J, Haidukewych G J, Boberg C J. Hospital cost of dislocation after primary total hip arthroplasty. J Bone Joint Surg Am 2006; 88(2): 290-94. doi: 10.2106/JBJS.D.02799 Statistics Sweden. Available at: https://www.scb.se. (2018). Swedish Fracture Register. https://sfr.registercentrum.se/sfr-in-english/theswedish-fracture-register/p/HyEtC7VJ4, 2018. Svenoy S, Westberg M, Figved W, Valland H, Brun O C, Wangen H, Frihagen F. Posterior versus lateral approach for hemiarthroplasty after femoral neck fracture: early complications in a prospective cohort of 583 patients. Injury 2017; 48(7): 1565-69. doi: 10.1016/j.injury.2017.03.024 Varley J, Parker M J. Stability of hip hemiarthroplasties. Int Orthop 2004; 28(5): 274-77. doi: 10.1007/s00264-004-0572-z WHO. International Statistical Classification of Diseases and Related Health Problems, 10th Revision; 2016. Available at: https://www.who.int/classifications/icd/icdonlineversions/en/. von Elm E, Altman D G, Egger M, Pocock S J, Gotzsche P C, Vandenbroucke J P, STROBE initiative. Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: guidelines for reporting observational studies. BMJ 2007; 335(7624): 806-08. doi: 10.1136/bmj.39335.541782.AD

Acta Orthopaedica 2021; 92 (4): 419–423

419

Precision of CT-based micromotion analysis is comparable to radiostereometry for early migration measurements in cemented acetabular cups Cyrus BRODÉN 1,2, Olof SANDBERG 3, Henrik OLIVECRONA 4, Roger EMERY 1, and Olof SKÖLDENBERG 2 1 Department

of Surgery and Cancer, Imperial College London, St Mary’s Hospital, London, UK; 2 Department of Clinical Sciences, Karolinska Institutet, Danderyd Hospital, Division of Orthopaedics, Stockholm, Sweden; 3 Sectra, Linköping, Sweden; 4 Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, Sweden Correspondence: cyrus.broden@gmail.com Submitted 2020-10-25. Accepted 2021-03-04.

Background and purpose — CT (computed tomography) based methods have lately been considered an alternative to radiostereometry (RSA) for assessing early implant migration. However, no study has directly compared the 2 methods in a clinical setting. We estimated the precision and effective radiation dose of a CT-based method and compared it with marker-based RSA in 10 patients with hip arthroplasty. Patients and methods — We included 10 patients who underwent total hip replacement with a cemented cup. CT and RSA double examinations were performed postoperatively, and precision and effective dose data were compared. The CT data was analyzed with CT micromotion analysis (CTMA) software both with and without the use of bone markers. The RSA images were analyzed with RSA software with the use of bone markers. Results — The precision of CTMA with bone markers was 0.10–0.16 mm in translation and 0.31°–0.37° in rotation. Without bone markers, the precision of CTMA was 0.10–0.16 mm in translation and 0.21°–0.31° in rotation. In comparison, the precision of RSA was 0.09–0.26 mm and 0.43°–1.69°. The mean CTMA and RSA effective dose was estimated at 0.2 mSv and 0.04 mSv, respectively. Interpretation — CTMA, with and without the use of bone markers, had a comparable precision to RSA. CT radiation doses were slightly higher than RSA doses but still at a considerably low effective dose.

Early migration of hip implants is associated with higher revision rates of prosthesis due to aseptic loosening (Kärrholm et al. 1994, Pijls et al. 2012). Radiostereometry (RSA) is the current gold-standard method to measure implant migration, given its accuracy and precision (Valstar et al. 2005). Lately there has been greater interest in using CT scans to measure implant migration to address some of the challenges with RSA, such as the need for specialized equipment and trained personnel to conduct and analyze examinations (Brodén et al. 2016, Otten et al. 2017). Previous experimental and clinical studies indicate that the accuracy and precision of CT techniques are comparable to those of RSA (Brodén et al. 2016, Scheerlinck et al. 2016, Brodén et al. 2019, 2020). However, to our knowledge there is no clinical study directly comparing CT and RSA in terms of precision for migration measurements on the same subjects. Recently a new commercial CT-based method, CT micromotion analysis (CTMA), was developed to analyze and measure implant migration between 2 CT images (Brodén et al. 2019, 2020). CTMA has features that make it possible to perform the migration analysis of CT data with tantalum beads and also with a technique relying solely on the surface anatomy of bone for the image registration, without the use of beads in the bone (Brodén et al. 2020). We compared the precision and effective dose of the 2 methods of CTMA with those of standard marker-based RSA in acetabular cups in patients with total hip arthroplasty (THA).

Patients and methods Study setup We selected the last 10 patients (mean age 67 [59–75]; 6 female; 10 hips) from a randomized study at the Orthopaedic Department of Danderyd Hospital comparing proximal migration of 2 types of cemented cups: an argon gas–sterilized © 2021 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group, on behalf of the Nordic Orthopedic Federation. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. DOI 10.1080/17453674.2021.1906082

420

(1)

Acta Orthopaedica 2021; 92 (4): 419–423

(2)

(3)

(4)

(5)

Figure 1. CTMA analysis workflow, using beads inserted in the bone for pelvic bone image registration. (1) First, a segmentation threshold of 2200 Hounsfield units is set to visualize the metallic structures in the pelvic bone (orange and blue). (2) The beads in the pelvic bone are selected. (3) When the registration occurs, green indicates a 1st successful registration. (4) A segmentation threshold of 2200 Hounsfield units is set to visualize the metallic structures of the cup, such as beads and threads in 2 separate CT images. The beads and the thread of the cup are selected. (5) Registration occurs; green indicates a successful 2nd registration.

polyethylene (PE) group and a Vitamin E–treated PE group (Muller Exceed ABT, Biomet, Warsaw, IN, USA). Included in that study were patients aged 40–75 years who had undergone THA (Sköldenberg et al. 2016). A posterior approach was used for the surgical procedure. The femoral component of the surgery consisted of an uncemented tapered, proximally porous-coated and hydroxyapatite-coated stem composed of a Ti-6Al-4V titanium alloy (BiMetric HA, Biomet) and a 32-mm chromium–cobalt head. Perioperatively, tantalum beads were inserted in the pelvis and liner of the cup. Patients were immediately mobilized at full weight-bearing with walking aids. The registration number of this clinical trial is NCT02254980. Data collection Follow-up for each patient involved a double examination at 1 week or at 3 months postoperatively. 10 double examinations were performed according to the following procedures: (1) positioning the patient in an RSA calibration cage, (2) taking the RSA radiographs, (3) repositioning the X-ray tubes, calibration cage and patient on the table, (4) taking an additional set of RSA radiographs, (5) moving the patient to a CT scanner, (6) taking 1 CT scan, (7) repositioning the patient in the CT scanner, and (8) taking an additional CT scan. RSA method For the RSA technique, we used a uniplanar calibration cage (Cage 43, RSA Biomedical AB, Umea, Sweden). Digital radiographs (Bucky Diagnost Philips, Eindhoven, the Netherlands) were taken using 2 X-ray sources angled at 40° to each other. The exposure was set to 120 kV and 8 mAs for each X-ray tube. UmRSA 6.0 computer software (RSA Biomedical, Umea, Sweden) was used for all the RSA migration analyses. The condition numbers (the spread of the markers) were below 100, and all mean errors of rigid body fitting (the stability of the markers) were below 0.30 mm.

CT method We used a CT scanner (Discovery CT750HD, GE Healthcare, Chicago, IL, USA) to acquire CT scans. The CT protocol was set to 120 kV, 10 mAs, slice thickness 0.625 mm with an increment of 0.31 mm, rotation time 0.6 s, pitch 0.894. Volumes were reconstructed with an x–y–z resolution of 0.6– 0.6–0.6 mm. The CT scans were analyzed with the image registration software CTMA (Sectra, Linköping, Sweden). CTMA makes it possible to analyze migration between 2 rigid bodies, such as the cup and the bone, in between 2 CT examinations (see Figure 1 for steps performance) (Brodén et al. 2020). In this study, the following steps were performed. 1. We imported 2 CT volumes into the CTMA software. A threshold segmentation for beads in the bone (2200 Hounsfield units [HU]) or bone thresholding (600 HU) was set. The segmentation allowed us to visualize the beads or bone depending on the CTMA technique that was selected. The same bone threshold was used for all examinations except in 2 patients, where the threshold was set to 400 HU and 550 HU due to deviances in the CT reconstruction settings. 2. The beads or the surface anatomy of the pelvis was selected for the image registration of the pelvic bone. 3. The pelvic bone was registered, and a visual overlap of the pelvic beads or the surface of the pelvic bone was achieved. 4. A threshold segmentation of 2200 HU was set to visualize the metallic implant structures such as the thread and beads of the cup. The thread and beads were selected for the image registration of the implant. 5. The implant was registered, and a visual overlap of the beads and thread of the implant was achieved. 6. The software calculated the motion of the center of mass of the implant compared with bone between these 2 CT volumes in a CT-based coordinate system. The result was a visual output in the form of registered 3D volumes and numerical migration values expressed in six degrees of freedom (translation along and rotation around x–y–z axis in a CT Dicom coordinate system). 7. The CT coordinate system was thereafter modified in a multi-planar reconstruction (MPR) view to obtain a coordinate system comparable to the RSA coordinate system. The CTMA procedure described in steps 1–7 was performed once using the beads in the bone and once using the surface pelvic anatomy (without beads in the bone) for steps 1–3. This CTMA procedure has previously been illustrated and described (Brodén et al. 2019, 2020). CT and RSA coordinate systems The coordinate systems used in RSA and CTMA differ (Figure 2). The RSA coordinate system is anatomical, fixed, and defined by the RSA calibration cage. The CTMA used the standard Dicom coordinate system that could be modified into that of the RSA coordinate system.

Acta Orthopaedica 2021; 92 (4): 419–423

421

Table 1. Precision comparison between CTMA and gold-standard RSA Factor

Figure 2. CT Dicom coordinate system of CTMA and coordinate system of RSA. The translations are positive in the direction of the arrow, and the rotations are positive in a clockwise direction.

Effective radiation dose The CT effective dose was estimated using the dose length product (DLP) multiplied by a pelvic conversion factor: 0.0129 mSv/mG.cm from IRCP 103, but also with a Monte Carlo simulation in an Impact CT dosimeter software (Deak et al. 2010, Saltybaeva et al. 2014). The RSA effective dose was estimated with a Monte Carlo simulation using a software PCXMC Dose Calculation available at Danderyd Hospital (STUK, Helsinki, Finland version 2.0.1.4) (Tapiovaara and Siiskonen 2008). Precision Precision is defined by the proximity between repeated measurements under similar conditions (ISO 16087:2013 2013). According to this standard, “Precision should be assessed with double measurement and shall be presented as the standard deviation of these calculated migrations. Assuming a normal distribution, the confidence intervals of the error should be expressed as ± 1,96×SD, for the 95% confidence interval (where SD is the standard deviation).” Statistics For the precision calculation, we used the ISO standard for the 95% confidence interval (CI) 1.96 x SD, but modified to the formula T-score x SD to take into account the small sample size, a practice prevalent in previous studies using this method (Sköldenberg and Odquist 2011, Brodén et al. 2020). The T-score was used instead of a Z-score to avoid underestimating the margin of error obtained with this small sample size. Ethics, data sharing plan, funding, and conflicts of interests The Ethical Committee of the Karolinska Institute approved the use of CT scans in this study (No. 2011/2003-31/1). The clinical data of this study will be available upon request at: cyrus.broden@gmail.com. OSK and RE did not have any conflict of interests. CB and HO received consultancy fees from Sectra Orthopaedics. OSA is a full-time employee at Sectra. Sectra Orthopaedics had no involvement in the data collection, analysis or interpretation of the data.

Precision CTMA Precision CTMA Precision RSA beads in bone bone anatomy beads n = 10 n = 10 n=9

X translation, mm Y translation, mm Z translation, mm X rotation, ° Y rotation, ° Z rotation, °

0.14 0.10 0.16 0.31 0.37 0.33

0.16 0.14 0.10 0.25 0.21 0.31

0.09 0.20 0.26 1.69 1.56 0.43

CT scans were analyzed with different modes of CTMA registrations (beads in bone or bone anatomy). “n” is the number of subjects.

Results The precision of CTMA using tantalum beads inserted in the bone for the pelvic bone registration varied between 0.10 and 0.16 mm in translation and 0.31° and 0.37° in rotation (Table 1, Table 2 Supplementary data). All the implanted beads in the bone and the implant itself were visualized. The precision of CTMA using bone surface anatomy for the bone registration varied between 0.10 and 0.16 mm in translation and 0.21° and 0.31° in rotation (Table 1, Table 3 Supplementary data). Some artefacts could be observed at the surface of the pelvic bone, but no patients needed to be excluded. The precision of the gold-standard RSA for this cohort varied between 0.09 and 0.26 mm in translation and 0.43° and 1.69° in rotation (Table 1, Table 4 Supplementary data). One patient was excluded from the RSA measurements due to marker occlusion. The CT mean effective radiation dose of the scans used in the CTMA analysis was estimated to be 0.2 mSv (0.10–0.22) with both techniques of dose calculation. The RSA mean effective dose was estimated to be 0.04 mSv (0.036–0.044).

Discussion The CTMA method was more precise in terms of rotation compared with RSA. We speculate this is because CT images made additional surfaces available compared with RSA for the image analysis, such as the surface anatomy of the pelvis, metallic thread of the cups, and additional markers since marker occlusion did not occur. This created a more rotational stable rigid body and improved the precision of our registrations in CTMA. These results are in accordance with a clinical study of another CT-based method conducted by Otten et al. (2017) where the limit of agreement between the CT and RSA varied between 1.05° and 2.17° in rotation. However, in our study CTMA using beads and surface anatomy was as precise

422

as standard RSA in translation. These results were below the critical threshold of 1 mm of early migrations that is suggested to predict loosening and could therefore be used in a clinical setting (Pijls et al. 2012). The precision of RSA for this cohort varied between 0.09 and 0.26 mm in translation and 0.43° and 1.69° in rotation. This is in accordance with a review paper by Kärrholm (2012) indicating that RSA precision in clinical studies varied between 0.15 and 0.60 mm for translations and between 0.3° and 2° for rotation. These values are less precise than those of RSA in an experimental setting, especially in rotation, ranging from 0.04 to 0.09 mm for translations and 0.08° to 0.32° for rotations (Brodén et al. 2016). This difference might be due to the use of sawbones without soft tissues in the experimental study by Brodén et al. The CTMA software without beads includes only common anatomical structures between 2 CT volumes for image registration. Therefore, anatomical changes such as ectopic bone formation, cortical thickening, or osteolysis, which might occur over a longer study time, might possibly not influence the precision of the method. However, with methods relying on beads inserted in the bone, bone remodelling over time might affect the positioning of the beads, which could affect the extrapolation of postoperative precision data to other timepoints. The CT effective dose was estimated using the dose length product (DLP) multiplied by a pelvic conversion, the most commonly used method in clinical routine (Deak et al. 2010). However, we also estimated the effective dose with a Monte Carlo simulation in Impact CT dosimeter software, since it is considered the most accurate effective dose-estimation method, due to its ability to model the interaction between radiation and matter, considering several parameters of the CT protocol (Saltybaeva et al. 2014). The mean effective dose of CTMA with both methods was 0.2 mSv, while the clinical RSA effective dose was 0.04 mSv. In a study by Blom et al. (2020), the effective dose of RSA was estimated to be 0.04 mSv, similar to our findings. The complexity of acquiring RSA images where patient and calibration cage are lined up correctly means that some additional retakes of RSA images could increase the effective dose for RSA. Retakes were not included in our calculation of RSA effective dose. The effective dose for CT is 5 times higher than for RSA in our study. However, it is lower than a standard anteroposterior pelvic radiograph (0.7 mSv) (Boettner et al. 2016). Moreover, the practicalities of CT, in addition to the potential to visualize the bone–implant interface, might justify the slightly higher dose. It is important to consider that this CT dose is suboptimal, because artefacts were observed, and higher doses could theoretically facilitate the analysis of CT scans with CTMA relying solely on the surface anatomy of the bone. The CT dose in this study is markedly at the lower end compared with those used in other CTMA studies with an effective dose

Acta Orthopaedica 2021; 92 (4): 419–423

ranging between 0.2 and 2.3 mSv (Brodén et al. 2020). However, CTMA precision does not appear to have been markedly affected by this lower dosage. An advantage of CTMA technology is the wide availability of CT scanners in hospitals, which facilitates early migration measurement of implants without the need for expensive investments in RSA laboratories. For RSA, imaging acquisition must be carefully monitored by a specially trained radiographer. For CTMA, a predetermined CT-scan protocol is chosen by a radiographer for the scan, and the positioning of the patient is not as crucial as in RSA, as no calibration cage is involved. Another advantage of CTMA is that the registration is performed in a 3D visual interface, which prevents the loss of tantalum markers due to marker occlusion. In our study, 1 RSA examination had to be excluded due to marker occlusions, which did not occur with CTMA. A limitation of this study is the small sample size. No statistical test was performed due to the small sample size. A limitation of the CTMA software is that the quality of the image registration must be verified manually. This is done in a 1st step with a visual colourmap feedback mechanism. In a 2nd step, a visual inspection in 2D of the axial, frontal, and sagittal view of the CT images is performed after each registration to verify if the bone or implant is well registered. Currently in CTMA, there is no equivalent number to the condition numbers and mean error of rigid body fitting to quantify the suitability of the rigid body to give correct measurements; this assessment must rely on the user’s experience and judgment. The CTMA analyses were performed by an experienced CTMA user (CB). Although Sandberg et al. (2020) previously have shown that CTMA precision can be reproduced by an inexperienced user, it is important to emphasize the importance of CTMA training before the use of this tool. We did not investigate the precision of CTMA for the femoral stem. For future investigations, a slightly higher effective dose might be needed to analyze the femur/stem migration. The stem is a metallic component, and the presence of soft tissue around the femur might increase the effective dose needed to obtain CT images of adequate quality without artefacts. Moreover, the whole stem was not included in the field of view of the CT scans, which could impact the ability to measure stem precision. In conclusion, we found that CTMA, with and without the use of bone markers for image registration, had a comparable precision when compared with standard RSA, and a slightly higher effective dose for cups in hip arthroplasty while still at a considerably low effective dose. Supplementary data Tables 2–4 are available as supplementary data in the online version of this article, http://dx.doi.org/10.1080/17453674.20 21.1906082

Acta Orthopaedica 2021; 92 (4): 419–423

CB, OSA, OSK, and HO were mainly responsible for the study conception and designed the study. CB and OSA drafted the manuscript. CB and OSK analyzed the data. RE, OSK, and HO gave feedback on the study design, reviewed the manuscript, and contributed with comments on the paper. The authors would like to thank Lise-Lotte Widmark for performing the CT scans and RSA examinations, and Ulf Petersen and Dina Tamras Skårfors for help with dose calculations. Acta thanks Bart G Pijls and Stephan Röhrl for help with peer review of this study.

Blom I F, Koster L A, Brinke B T, Mathijssen N M C. Effective radiation dose in radiostereometric analysis of the hip. Acta Orthop 2020; 91(4): 1-395. Boettner F, Sculco P, Lipman J, Renner L, Faschingbauer M. A novel method to measure femoral component migration by computed tomography: a cadaver study. Arch Orthop Trauma Surg 2016; 136(6): 857-63. Brodén C, Olivecrona H, Maguire G Q, Noz M E, Zeleznik M P, Sköldenberg O. Accuracy and precision of three-dimensional low dose CT compared to standard RSA in acetabular cups: an experimental study. BioMed Res Int 2016; 2016: 5909741. Brodén C, Giles J W, Popat R, Fetherston S, Olivecrona H, Sandberg O, Maguire G Q, Noz M E, Sköldenberg O, Emery R. Accuracy and precision of a CT method for assessing migration in shoulder arthroplasty: an experimental study. Acta Radiol Stockh Swed 2019; 284185119882659. Brodén C, Sandberg O, Sköldenberg O, Stigbrand H, Hänni M, Giles J W, Emery R, Lazarinis S, Nyström A, Olivecrona H. Low-dose CT-based implant motion analysis is a precise tool for early migration measurements of hip cups: a clinical study of 24 patients. Acta Orthop 2020; 91(3): 260-5. Deak P D, Smal Y, Kalender W A. Multisection CT protocols: sex- and age-specific conversion factors used to determine effective dose from doselength product. Radiology 2010; 257(1): 158-66. International Organization for Standardization. ISO 16087:2013. Implants for surgery- Roentgen stereophotogrammetric analysis for the assessment of migration of orthopaedic implants; 2013.

423

Kärrholm J. Radiostereometric analysis of early implant migration: a valuable tool to ensure proper introduction of new implants. Acta Orthop 2012; 83(6): 551-2. Kärrholm J, Borssén B, Löwenhielm G, Snorrason F. Does early micromotion of femoral stem prostheses matter? 4–7-year stereoradiographic follow-up of 84 cemented prostheses. J Bone Joint Surg Br 1994; 76(6): 912-17. Otten V, Maguire G Q, Noz M E, Zeleznik M P, Nilsson K G, Olivecrona H. Are CT scans a satisfactory substitute for the follow-up of RSA migration studies of uncemented cups? A Ccomparison of RSA double examinations and CT datasets of 46 total hip arthroplasties. BioMed Res Int 2017; 2017: 3681458. Pijls B G, Nieuwenhuijse M J, Fiocco M, Plevier J W, Middeldorp S, Nelissen R G, Valstar E R. Early proximal migration of cups is associated with late revision in THA: a systematic review and meta-analysis of 26 RSA studies and 49 survival studies. Acta Orthop 2012; 83(6): 583-91. Saltybaeva N, Jafari M E, Hupfer M, Kalender W A. Estimates of effective dose for CT scans of the lower extremities. Radiology 2014; 273(1): 153-9. Sandberg O, Tholén S, Carlsson S, Wretenberg P. The anatomical SP-CL stem demonstrates a non-progressing migration pattern in the first year: a low dose CT-based migration study in 20 patients. Acta Orthop 2020; 91(6): 654-9. Scheerlinck T, Polfliet M, Deklerck R, Van Gompel G, Buls N, Vandemeulebroucke J. Development and validation of an automated and markerfree CT-based spatial analysis method (CTSA) for assessment of femoral hip implant migration: in vitro accuracy and precision comparable to that of radiostereometric analysis (RSA). Acta Orthop 2016; 87(2): 139-45. Sköldenberg O, Odquist M. Measurement of migration of a humeral head resurfacing prosthesis using radiostereometry without implant marking: an experimental study. Acta Orthop 2011; 82(2): 193-7. Sköldenberg O, Rysinska A, Chammout G, Salemyr M, Muren O, Bodén H, Eisler T. Migration and head penetration of Vitamin-E diffused cemented polyethylene cup compared to standard cemented cup in total hip arthroplasty: study protocol for a randomised, double-blind, controlled trial (E1 HIP). BMJ Open 2016; 6(7): e010781. Tapiovaara M, Siiskonen T. PCXMC: a Monte Carlo program for calculating patient doses in medical X-ray examinations 2nd ed. Helsinki: STUK; 2008. Valstar E R, Gill R, Ryd L, Flivik G, Börlin N, Kärrholm J. Guidelines for standardization of radiostereometry (RSA) of implants. Acta Orthop 2005; 76(4): 563-72.

424

Acta Orthopaedica 2021; 92 (4): 424–430

Sex differences in incidence rate, and temporal changes in surgical management and adverse events after hip fracture surgery in Denmark 1997–2017: a register-based study of 153,058 hip fracture patients Liv R WAHLSTEN 1, Henrik PALM 2, Gunnar H GISLASON 3, and Stig BRORSON 4 1 Department of Orthopaedics, Copenhagen University Hospital Herlev-Gentofte; 2 Department of Orthopaedics, Copenhagen University Hospital Bispebjerg; 3 Department of Cardiology, Research 1, Copenhagen University Hospital Herlev-Gentofte; 4 Department of Orthopaedic Surgery, Zealand

University Hospital Køge, Denmark Correspondence: posttilliv@gmail.com Submitted 2020-09-24. Accepted 2021-03-11.

Background and purpose — Extensive research and national multidisciplinary programs have striven to introduce uniform standards of treatment and mitigate mortality and adverse events after hip fracture surgery over the past decades. A large-scale overview of temporal developments in hip fracture surgery and care is warranted. Patients and methods — We studied Danish patients aged ≥ 60 years, sustaining their first ever hip fracture between 1997 and 2017. Patients were identified from the Danish National Patient Registry (DNPR). Incidence rates of first hip fracture were calculated per 1,000 patient-years and stratified by age group and sex. Information on preinjury living settings, comorbidities, and medications were obtained from national administrative registers. Type of fracture and treatment choice were recorded, and patients were followed for 1 year to observe mortality, readmission, and surgical complications. Results — Data from 153,058 patients was analyzed. Incidence rate decreased in both sexes, but only led to a reduction in the annual number of hip fractures in the female population. Choice of surgery shifted away from sliding hip screws and parallel implants (SHS-PI), towards intramedullary nailing and hemi-/arthroplasties for trochanteric and femoral neck fractures, respectively. Pre-injury diagnosed morbidity and 1-year readmissions increased contrary to mortality. Median age remained stable around 83 (IQR 77–88) for women and 80 (IQR 73–86) for men. Interpretation — Over the past 2 decades important aspects of hip fracture management have improved. However, sex differences were observed, and men remain more vulnerable than women in terms of morbidity, mortality, and incidence rate.

The collateral effects of hip fractures are tremendous in terms of healthcare costs, morbidity, mortality, and lost quality of life for the patients and their families (Vochteloo et al. 2013, Schemitsch et al. 2019). Increasing incidence of hip fractures in the last 50 years, longer life expectancy, and large birth cohorts during World War II have predicted a significant rise in the incidence of hip fractures in the coming decades (Cooper et al. 1992, 2011, Rosengren and Karlsson 2014). In the past 20 years, extensive research has striven to develop and amend fracture prevention, algorithms for operative treatment, and multidisciplinary approaches to care and rehabilitation, in order to improve survival and mitigate adverse events after hip fractures. Many aspects thereof have been analyzed, changed, and evaluated as the perception of hip fracture surgery has shifted from routine surgery often performed by younger surgeons to a global ambition for complex highly specialized treatment regimens in multidisciplinary settings. In 1999 the first national Danish reference program for the management of hip fractures emerged as a part of a government-initiated National Indicator Project (NIP) aimed to improve uniformity and quality. The aim was to suggest, describe and implement quality indicators suitable for benchmarking and monitoring the management of hip fractures in Denmark across regions and departments. Since then, several national cross-sectional quality projects have been deployed. A large-scale overview of temporal developments in hip fracture surgery and care is warranted to help clinicians, politicians, and caretakers to see new opportunities to improve safety and care for patients with hip fractures in the future. In this study we report how the Danish hip fracture population and management have changed over 2 decades in terms

Acta Orthopaedica 2021; 92 (4): 424–430

of incidence rates, pre-fracture comorbidity, choice of primary implant, surgical complications, readmissions, and mortality after hip fracture surgery.

Patients and methods Throughout the study period, the entire Danish population of 5.6 million inhabitants has been covered by tax-financed public health and social care insurance, securing free and equal access to healthcare services free of personal charge. Study population All Danes aged ≥ 60 years undergoing their first operative procedure due to a hip fracture, between 1997 and 2017, were identified and followed up 1 year from the day of admission or until death, whichever came first. Patients were included if they had a hip fracture diagnosed and surgery in the same hospital stay or overlapping hospitals stays. Patients with prior diagnosis or procedures compatible with having a hip fracture were excluded. The positive predictive value of the hip fracture diagnosis code and hip fracture surgical procedure codes in DNPR are above 90% separately (Hjelholt et al. 2020). By combining diagnosis and surgery codes, the positive predictive value of the hip fracture diagnoses was suspected to be even higher. ICD-8 and 10 codes and NSCP codes used to define the population are listed in Table 1 (see Supplementary data). Data sources In the Danish Central Population Register a unique and permanent civil registration number for each individual resident in Denmark is held. Upon contact with any part of the public system, e.g., healthcare services, the civil registration number is registered in the appropriate administrative register along with an action code. Linkage of information across registries is feasible on the individual level through the civil registration number. Registrations from 5 different national administrative registries were retrieved in this study: the Central Population Registry, Danish National Patient Registry, National Prescription Registry, Nursing Home Registry, and Household Income Registry. Pre-injury status, type of fracture, and choice of implants Information on pre-existing comorbidity, defined by the international ICD-10/8 system, was retrieved from the Danish National Patient Registry (DNPR). Diagnoses given or renewed 10 years prior to admission were considered active. To ensure accuracy some conditions, e.g., chronic obstructive pulmonary disease (COPD), diabetes, and depression, were defined from a combination of diagnosis and drug usage. Data on concomitant medications according to Anatomical Therapeutic Chemical Classification (ATC)

425

codes was collected from the National Prescription Registry. Claimed prescriptions within 6 months prior to admission for hip fracture were considered to represent active treatment. Information on pre-injury living settings, i.e., living alone or co-living, was drawn from the Household Income Registers provided by Statistics Denmark based on “family type.” Living in a nursing home prior to admission was defined as the shift from living in an unstaffed private home to a permanent institutional residence. Based on procedure codes registered in the DNPR, surgical treatment was categorized as either “arthroplasty”—covering hemi- and total arthroplasties, intra-medullary nailing (IMN), or sliding hip screws (SHS) and parallel implants (PI), collectively referred to as SHS-PI. SHS and PI were gathered into one category because of overlapping indications and the treatment choice is mostly determined by local tradition. Data on surgical complication rates was retrieved from the DNPR and defined as the 1st occurrence of removal of an implant, reoperation due to superficial or deep infection or dislocation of arthroplasty, within 1 year of surgery Statistics Differences in the distribution of comorbidity and living setting at baseline, between males and females, were calculated using Student’s t-test for continuous variables and a chi-square test for categorical variables. Age and sex stratified incidence rates were calculated in a time-updated model as number of first ever hip fractures per 1,000 person-years in that age and sex category, where the number of person years and new events, in each category, were updated each month. Cumulative incidence functions were applied when time to event and absolute crude risks were of interest, i.e., mortality and readmissions. For readmissions, the Aalen–Johansen estimator was used in amendment of the cumulative incidence function to calculate competing risk of death. 95% confidence intervals (CI) were calculated for all estimated parameters. All analyses were performed with SAS statistical software version 9.4 (SAS Institute Inc, Cary, NC, USA) and R-studio version 3.2.2 (2016-10-31) (R Foundation for Statistical Computing, Vienna, Austria). Ethics, funding, and potential conflicts of interest In Denmark, registry-based studies do not require ethical committee approval or patient consent if the study is conducted for the sole purpose of statistics and scientific research as defined in the Data Protection Act. Approval to use the data sources for this research project was granted by the data responsible institute in the Capital Region of Denmark in accordance with the General Data Protection Regulation (approval number, P-2019-404). This research did not receive grants from any funding agency in the public, commercial, or not-for-profit sectors. There are no conflicts of interest to declare for any of the authors.

426

Acta Orthopaedica 2021; 92 (4): 424–430

Table 2. Study population with living status, fracture type, and pre-existing comorbidity. Values are count (%) unless otherwise specified Factor

Female n = 6,144

Age (IQR) Living alone Resident in nursinghome Femoral neck fractures Diabetes Depression Parkinsons disease Dementia Heart disease COPD a Osteoporosis Cancer

83 (77–88) 80 (74–86) 4,541 (74) 996 (50) 1,136 (19) 306 (15) 3,422 (56) 1,158 (58) 449 (7.3) 183 (9.2) 1,410 (23) 317 (16) 277 (4.5) 115 (5.8) 354 (5.8) 107 (5.4) 1,789 (29) 700 (35) 326 (5.3) 206 (10) 464 (7.6) 33 ( 1.7) 650 (11) 263 (13)

1997

Male n = 1,983 p-value < 0.001 < 0.001 0.002 0.04 0.006 < 0.001 0.02 0.6 < 0.001 < 0.001 < 0.001 0.001

2007 2017 Female Male Female Male n = 5,016 n = 1,989 p-value n = 4,079 n = 2,031 83 (77–88) 81 (73–86) < 0.001 3,568 (71) 1,017 (51) < 0.001 982 (20) 333 (17) 0.007 2,690 (54) 1,030 (52) 0.2 569 (11) 303 (15) < 0.001 1,707 (34) 583 (29) < 0.001 173 (3.4) 111 (5.6) < 0.001 751 (15) 293 (15) 0.8 2,293 (46) 1,112 (56) < 0.001 539 (11) 259 (13) 0.008 810 (16) 160 (8.0) < 0.001 648 (13) 364 (18) < 0.001

p-value

83 (76–89) 80(72–87) < 0.001 3,090 (76) 1,171 (58) < 0.001 429 (11) 179 (8.8) 0.04 2,213 (54) 1,097 (54) 0.9 551 (14) 364 (18) < 0.001 1,163 (29) 462 (23) < 0.001 169 (4.1) 125 (6.2) 0.001 610 (15) 261 (13) 0.03 1,975 (48) 1,182 (58) < 0 .001 494 (12) 288 (14) 0.03 1,101 (27) 261 (13) < 0.001 687 (17) 438 (22) < 0.001

Chronic obstructive pulmonary disease.

age and sex are displayed. Absolute crude annual numbers of first hip fractures and person40 3 years in the Danish population 2,000 30 are also displayed to demon2 strate the demographic devel20 opment in each age stratum. A 1,000 1 steady decrease in incidence 10 was seen in all age groups in the female hip fracture popula0 0 0 1997 2002 2007 2012 2017 1997 2002 2007 2012 2017 1997 2002 2007 2012 2017 tion. The decrease in incidence rate among the 71–80-year-old Person-years x 10 in all Danish men Hip fractures per year in Danish men Hip fractures per 1,000 person-years 4 3,000 50 women from 7.9 (CI 7.5–8.2) Age 60–70 Age 71–80 Age 81–90 per 1,000 person years (kPY) Age > 90 40 3 to 3.9 (CI 3.7–4.2) per kPY 2,000 and the 81–90-year-old women 30 2 from 24 (CI 23–25) per kPY to 20 14 (CI 13–15) per kPY (Figure 1,000 1a) had the largest impact on the 1 10 total annual number of first hip fractures. The curve flattened in 0 0 0 1997 2002 2007 2012 2017 1997 2002 2007 2012 2017 1997 2002 2007 2012 2017 the last part of the study period Figure 1. Incidence of hip fracture in the Danish population 1997–2017. for the 71–80-year-olds, due to a. Top row. Person-years in the entire Danish population of women, absolute number of hip fractures per large birth cohorts during World year among women, and incidence rate of hip fractures among women, all displayed in age strata. b. Bottom row. Person-years in the entire Danish population of men, absolute number of hip fractures per War II. A more modest reducyear among men, and incidence rate of hip fractures among men, all displayed in age strata. tion in incidence was observed in men (Figure 1b). The maximum decrease was seen among the 71–80-year-olds where a reduction from 4.2 (CI 3.9–4.5) to 2.4 (CI 2.2–2.6) per kPY Results led to a slight fall in annual number of hip fractures up to Incidence of hip fractures 2012 but was exceeded by the rise in person-years from 2013 From 1997 to 2017, 153,058 Danish patients sustained their onwards. Overall, the increase in person-years introduced by first hip fracture. The annual number of first hip fractures fell the large birth cohorts during World War II and increasing life gradually during the entire study period from 8,127 patients expectancy was balanced by a decrease in incidence in men in 1997 to 6,110 in 2017 (Table 2). In Figure 1a incidence but exceeded by the decrease in women, which led to a stable rates of patients suffering their first hip fracture stratified by annual number of fractures in men and a decrease in women. Person-years x 105 in all Danish women

Hip fractures per year in Danish women

Hip fractures per 1,000 person-years

3,000

Age 60–70 Age 71–80 Age 81–90 Age > 90

Acta Orthopaedica 2021; 92 (4): 424–430

427

Distribution (%) – femoral neck fracture

Distribution (%) – trochanteric fracture

100

1997 2007 2017

SHS-PI

IMN

Arthroplasty

SHS-PI

IMN

Arthroplasty

Figure 2. Temporal trends in operative management of hip fractures. SHS-PI: Sliding hip screws and parallel implants, IMN: Intramedullary nailing.

Pre-injury status On average women were 2.6 years older than men when sustaining their first hip fracture, with a median age of 83 years (interquartile range [IQR] 77–89). Only small variations were seen in age over time. More women were living alone. The proportion of patients who were resident in nursing homes at the time of injury was higher in 1997 and fell successively in 2007 and 2017 for both men and women (Table 2). The burden of diagnosed diseases before sustaining a hip fracture increased during the study period for both sexes, but in general men had more diagnosed diseases than women except for depression and osteoporosis, which were more common in women (Table 2). Complications (%) – SHS-PI

Complications (%) – IMN

15 Implant removal – trochanteric fractures Implant removal – femoral neck fractures Implant infection

Implant removal Implant infection

0 1997

Type of fracture, choice of treatment, and surgical complications In Figure 2, the choice of surgical treatment according to fracture type is shown for the years 1997, 2007, and 2017. The use of SHS-PI fell gradually throughout the study period from 48% (CI 47–49) and 34% (CI 32–35) in femoral neck and trochanteric fractures in 1997 to 25% (CI 23–26) and 16% (CI 15–17) in 2017, respectively, and were replaced by an increase in arthroplasties and intramedullary nailing. The proportion of trochanteric versus femoral neck fracture was stable. Fracture-related complications or secondary surgery measured by removal of implants, implant related infections, or dislocations of arthroplasties in the first postoperative year is shown in Figure 3. Removal of implants was more frequent after SHS-PI of femoral neck fractures, compared with SHS-PI of trochanteric fractures and intramedullary nailing and arthroplasties. The latter showed a decrease in implant removal over time, but also a tendency towards an increase in dislocations. Length of stay, readmission, and mortality The median length of stay in hospital shortened from 13 days (IQR 7–22) in 1997 to 6 days (IQR 4–9) in 2017 for both sexes (Table 3). There was an inverse relationship between absolute risk of readmission and time from discharge, which can be observed in the slope of the cumulative incidence curve in Figure 3. The rate and pace of readmission were similar in 2007 and 2017, but lower in 1997 (Figure 4). The 1-year cumulative incidence of postComplications (%) – Arthroplasty 15 operative mortality was Implant removal similar in 1997 and 2007 Implant infection Dislocation but fell gradually from 2007 to 2017, to 30% 10 and 23% for men and women (Table 3, Figure 5). Mortality was higher 5 among men at all measured timepoints.

0 2002

2007

2012

2017

2002

2007

2012

2017

0 1997

2002

2007

2012

2017

Figure 3. Surgical complications in the first postoperative year. SHS-PI: Sliding hip screws and parallel implants, IMN: Intramedullary nailing.

Table 3. Median length of stay in hospital and cumulative postoperative mortality. Values are count (%) unless otherwise specified Factor Length of stay (IQR) In hospital mortality 30-day mortality 1-year mortality

Female n = 6,144

1997

13 (7–22) 324 (5.3) 501 (8.2) 1,568 (25)

Male n = 1,983 p-value 13 (7–21) 167 (8.4) 266 (13) 711 (36)

2007 2017 Female Male Female Male n = 5.016 n = 1,989 p-value n = 4,079 n = 2,031

NA 9 (5–13) < 0.001 165 (3.3) < 0.001 413 (8.2) < 0.001 1,335 (27)

9 (5–13) 130 (6.5) 291 (15) 724 (36)

NA < 0.001 < 0.001 < 0.001

6 (4–9) 100 (2.5) 316 (7.7) 942 (23)

6 (4–9) 102 (5.0) 235 (12) 612 (30)

p-value NA < 0.001 < 0.001 < 0.001

428

Acta Orthopaedica 2021; 92 (4): 424–430

Cumulative readmission incidence (%)

Cumulative mortality (%) – 1997

Cumulative mortality (%) – 2007

Cumulative mortality (%) – 2017

Male Female

2017 2007 1997

Male Female

0 0

180

270

365

0 0

Days from discharge

Figure 4. All-cause readmission after hip fracture surgery. At risk D 0

D 90 D 180 D 270 D 365

1997: 8,127 5,254 4,503 3,929 3,470 2007: 7,005 4,415 3506 3,068 2,741 2017: 6,110 3,741 3,255 2,831 2,536

180

270

365

0 0

180

Days from surgery

270

365

Days from surgery

180

270

365

Days from surgery

Figure 5. 1-year mortality after hip fracture surgery. At risk

D 0

D 90 D 180 D 270 D 365

Female: 6,144 5,283 4,988 4,804 4,627 Male: 1,983 1,561 1,437 1,351 1,293

Discussion Key findings In this study of 153,058 Danish patients who suffered their first hip fracture in 1997–2017, we found the incidence and the overall annual number of hip fractures to be steadily declining, especially for women, despite strong secular trends towards an ageing population. During the study period the average age of primary hip fracture was unchanged, while the level of pre-injury comorbidities increased. For both femoral neck and trochanteric fractures, the treatment preference moved away from SHS-PI towards arthroplasties and intramedullary nailing, respectively. The median length of stay in hospital shortened contrary to readmission rates, which were lower in 1997 compared with 2007 and 2017. Complications showed noticeable fluctuations over time. Mortality decreased in all measures for both sexes in the last decade of observation compared with the first. Incidence and comorbidities of hip fractures The incidence of hip fractures increased in many parts of the world until the late 1990s; around the millennium a plateau effect was described and after that a fall in incidence has been observed in many industrialized countries (Ahlborg et al. 2010, Sullivan et al. 2016, Rosengren et al. 2017). It has been debated whether the decrease in incidence rate of hip fractures would be enough to balance the effect of a growing population of elderly individuals (Korhonen et al. 2013, Rosengren and Karlsson 2014, Lewiecki et al. 2018). In the female population the decrease in incidence was sufficient to cause fewer annual hip fractures, whereas the incidence rate in the male population only kept the total annual number at a stable level. The cause of the difference between the sexes is unexplained, but merits further research as it may hold a causal inference that can reveal modifiable factors which can aid prevention in the future. One important aspect is the

At risk

D 0

D 90 D 180 D 270 D 365

Female: 5,016 4,220 3,995 3,827 3,700 Male: 1,989 1,544 1,428 1,353 1,275

At risk

D 0

D 90 D 180 D 270 D 365

Female: 4,079 3,535 3,371 3,257 3,149 Male: 2,031 1,648 1,573 1,496 1,426

difference in prevalence of osteoporosis between women and men, where women have a 2–3 times higher prevalence than men (Abrahamsen and Vestergaard 2010) and men may exhibit different disease patterns than women (Haentjens et al. 2004). The extensive focus on osteoporosis in women may have led to a perception of osteoporosis as a women’s disease, leading to a negative detection bias and more pronounced diagnostic and treatment deficits in men (Vestergaard et al. 2005). Pre-existing comorbidity is a strong predictor of rehabilitation outcome and mortality after hip fracture (Roche et al. 2005, Vestergaard et al. 2007, Sheehan et al. 2018). From 1997 to 2017 a substantial increase in the proportion of patients with comorbidity was observed for several diagnoses i.e., diabetes, osteoporosis, COPD, cancer, heart disease, and dementia. Rather than an expression of declining health status in the hip fracture population, this development represents the increased focus on health and wellbeing of the elderly over the past 20 years, leading to more rigorous examinations, better diagnosis, and more treatment deployed earlier in the course of a disease. This aspect correlates well with improved survival in the hip fracture population and the rise in mean life expectancy in the general population. Type of fracture, choice of treatment, and complications Operative management of hip fractures has been a topic of debate for many years, regarding choice of procedure, implant, approach, and anesthetic strategy. In the last 20 years evidence-based treatment algorithms have been introduced and widely used to determine surgical treatment in Denmark (Palm et al. 2012). Focus on uniform treatment guidelines may have contributed to the shift from SHS-PI toward arthroplasties and intramedullary nailing for femoral neck and trochanteric fractures, respectively (Rogmark and Leonardsson 2016). Complications leading to removal of implants have fluctuated in frequency over the years, especially for patients with femoral neck fractures who were

Acta Orthopaedica 2021; 92 (4): 424–430

treated with SHS-PI. The term “complication” is widely used for secondary procedures, but may be inappropriate in some cases, especially for removal of implants. While the majority of implant removals after femoral neck fractures are likely to emanate from femoral head necrosis, cut-out of implants, and non-union problems, some implant removals may be requested by the patient due to minor soft tissue irritation after successful bone healing. Detailed information regarding the cause of removal was not obtainable for this study. For SHS-PI and intramedullary nailing, there was a tendency to a rise in frequency from 1997 to 2007, but a decline from 2007 onwards, leading to an overall reduction during the study period. Local initiatives to reduce complication rates may show clear efficacy that is not detectable on a national level. In 2 prospective Swedish cohort studies of 1,111 hip fracture arthroplasties, dislocations were 3–4 times higher with the traditional posterolateral approach compared with an anterolateral approach (Enocson et al. 2008, Sköldenberg et al. 2010), while the only RCT between the lateral approach or the piriformis-saving posterior approach used for 216 patients by the same primary surgeon saw no difference (Parker 2015). In Denmark the posterolateral approach has been widely used for decades despite the unchanged high dislocation rate. Increasing focus has, however, emerged in the most recent years with 2 schools of thought: either preserving the piriformis tendon or implementing the anterolateral approach. Length of stay, readmission, and mortality The introduction of fast-track programs for hip fracture patients in the late 1990s has been widespread, although only partially implemented in many centers (Egerod et al. 2010). The benefits of short total length of stay are controversial, but most studies find that this does not appear to be harmful (Haugan et al. 2017, Pollmann et al. 2019) down to a certain level. As in other countries, the median length of hospital stay was reduced in Denmark from 1997 to 2017, but an increase in frequency and pace of readmission was observed collaterally (Figure 4). The rise in frequency of readmission could be multifactorial, e.g., changes in the standard of care, enhanced survival that made more patients eligible for readmission, changes in discharge criteria leading to premature discharge, or a lower threshold for hospital admissions in general etc. This area deserves further investigations. In the first 10 years, a reduction in mortality was seen in inhospital mortality. During the second 10 years of observation 30-day mortality and 1-year mortality declined as well. Over time a decrease in mortality was seen for both sexes, but men had higher mortality at all measured timepoints throughout the study period, compared with women (Figure 5 and Table 2), which is in accordance with other hip fracture populations and study methods (Kannegaard et al. 2010, von Friesendorff et al. 2016, Pollmann et al. 2019). The gradual improvement in

429

mortality complies well with the overall hip fracture efforts in Denmark, which initially focused on the in-hospital management of hip fractures, and during the past century expanded to include after-care and community-based rehabilitation. An important derived effect of the awareness that high-quality care for hip fracture patients comprises more than choosing the right implant is the implementation of orthogeriatric care. In a prospective cohort study from UK the association between mortality and implementation of orthogeriatric teams, based on 33,152 hip fracture patients, showed hazard ratios of 0.73 (CI 0.62–0.82) and 0.81 (CI 0.75–0.87) for 30 day and 1-year mortality, respectively (Hawley et al. 2016). Strength and limitations This study offers an overview of developments and time trends in a broad spectrum of aspects in hip fracture surgery. Similar studies are difficult to undertake in most other countries. The major strength of this study is the large number of patients free of selection, surveillance, and recall biases. Also, the high reliability of the Danish administrative register system, where registrations are mandatory for departmental reimbursement of health and social care expenses, ensures the trustworthiness of these results. It is reasonable to assume that these results transfer to other populations than the Danish, especially to populations with similar living standards, life expectancy, and access to healthcare. A major limitation in this study is the lack of clinical observations and context, which could allow more detailed analysis and causal interpretations. Other limitations are inherent to the nature of this study and may have led to conservative estimates, for instance only complications that lead to surgery/hospital contact were registered, thus only patients alive, capable, and eligible for secondary surgery appear in this count. Conclusion Over the past 2 decades, a decrease in incidence rate and mortality has been accomplished for the hip fracture population. The median age of first hip fracture was stable throughout the study period despite increasing life expectancy and more comorbidities at the time of fracture. Men remain more vulnerable measured by postoperative mortality compared with women and showed a lower tendency to a decrease in incidence rate of hip fractures. Choice of surgery shifted away from SHS-PI, toward intramedullary nailing and hemi-/arthroplasties for trochanteric and femoral neck fractures, respectively. Further research is warranted to determine causality of the gender difference in incidence rate that we observed in this study as it may play an important role in preventive strategies in the future. Supplementary data Table 1 is available as supplementary data in the online version of this article, http://dx.doi.org/10.1080/17453674.2021. 1923256

430

LRW, SB, and HP developed the study idea and research questions, designed the study, interpreted data, and revised the paper. LRW also obtained and refined data collection, planned and performed statistical analysis, and graphical presentation and drafted the paper. GG provided data access, planned statistical analysis, interpreted data, and revised the paper. Acta thanks Jan-Erik Gjertsen and Björn Rosengrenhelp with peer review of this study.

Abrahamsen B, Vestergaard P. Declining incidence of hip fractures and the extent of use of anti-osteoporotic therapy in Denmark 1997–2006. Osteoporos Int 2010; 21(3): 373-80. doi: 10.1007/s00198-009-0957-3. Ahlborg H G, Rosengren B E, Jarvinen T L, Rogmark C, Nilsson J A, Sernbo I, Karlsson M K. Prevalence of osteoporosis and incidence of hip fracture in women: secular trends over 30 years. BMC Musculoskelet Disord 2010; 11: 48. doi: 10.1186/1471-2474-11-48. Cooper C, Campion G, Melton L J. 3rd. Hip fractures in the elderly: a world-wide projection. Osteoporos Int 1992; 2(6): 285-9. Cooper C, Cole Z A, Holroyd C R, Earl S C, Harvey N C, Dennison E M, Melton L J, Cummings S R, Kanis J A, Epidemiology ICWGoF. Secular trends in the incidence of hip and other osteoporotic fractures. Osteoporos Int 2011; 22(5): 1277-88. doi: 10.1007/s00198-011-1601-6. Egerod I, Rud K, Specht K, Jensen P S, Trangbaek A, Ronfelt I, Kristensen B, Kehlet H. Room for improvement in the treatment of hip fractures in Denmark. Dan Med Bull 2010; 57(12): A4199. Enocson A, Tidermark J, Tornkvist H, Lapidus L J. Dislocation of hemiarthroplasty after femoral neck fracture: better outcome after the anterolateral approach in a prospective cohort study on 739 consecutive hips. Acta Orthop 2008; 79(2): 211-17. doi: 10.1080/17453670710014996. Haentjens P, Johnell O, Kanis J A, Bouillon R, Cooper C, Lamraski G, Vanderschueren D, Kaufman J M, Boonen S, Network on Male Osteoporosis in Europe. Evidence from data searches and life-table analyses for gender-related differences in absolute risk of hip fracture after Colles’ or spine fracture: Colles’ fracture as an early and sensitive marker of skeletal fragility in white men. J Bone Miner Res 2004; 19(12): 1933-44. doi: 10.1359/JBMR.040917. Haugan K, Johnsen L G, Basso T, Foss O A. Mortality and readmission following hip fracture surgery: a retrospective study comparing conventional and fast-track care. BMJ Open 2017; 7(8): e015574. doi: 10.1136/ bmjopen-2016-015574. Hawley S, Javaid M K, Prieto-Alhambra D, Lippett J, Sheard S, Arden N K, Cooper C, Judge A, Group REs. Clinical effectiveness of orthogeriatric and fracture liaison service models of care for hip fracture patients: population-based longitudinal study. Age Ageing 2016; 45(2): 236-42. doi: 10.1093/ageing/afv204. Hjelholt T J, Edwards N M, Vesterager J D, Kristensen P K, Pedersen A B. The positive predictive value of hip fracture diagnoses and surgical procedure codes in the Danish Multidisciplinary Hip Fracture Registry and the Danish National Patient Registry. Clin Epidemiol 2020; 12: 123-31. doi: 10.2147/CLEP.S238722. Kannegaard P N, van der Mark S, Eiken P, Abrahamsen B. Excess mortality in men compared with women following a hip fracture: national analysis of comedications, comorbidity and survival. Age Ageing 2010; 39(2): 203-9. doi: 10.1093/ageing/afp221. Korhonen N, Niemi S, Parkkari J, Sievanen H, Palvanen M, Kannus P. Continuous decline in incidence of hip fracture: nationwide statistics from Finland between 1970 and 2010. Osteoporos Int 2013; 24(5): 1599-603. doi: 10.1007/s00198-012-2190-8.

Acta Orthopaedica 2021; 92 (4): 424–430

Lewiecki E M, Wright N C, Curtis J R, Siris E, Gagel R F, Saag K G, Singer A J, Steven P M, Adler R A. Hip fracture trends in the United States, 2002 to 2015. Osteoporos Int 2018; 29(3): 717-22. doi: 10.1007/s00198017-4345-0. Palm H, Krasheninnikoff M, Holck K, Lemser T, Foss N B, Jacobsen S, Kehlet H, Gebuhr P. A new algorithm for hip fracture surgery: reoperation rate reduced from 18% to 12% in 2,000 consecutive patients followed for 1 year. Acta Orthop 2012; 83(1): 26-30. doi: 10.3109/17453674.2011.652887. Parker M J. Lateral versus posterior approach for insertion of hemiarthroplasties for hip fractures: a randomised trial of 216 patients. Injury 2015; 46(6): 1023-7. doi: 10.1016/j.injury.2015.02.020. Pollmann C T, Rotterud J H, Gjertsen J E, Dahl F A, Lenvik O, Aroen A. Fast track hip fracture care and mortality: an observational study of 2230 patients. BMC Musculoskelet Disord 2019; 20(1): 248. doi: 10.1186/ s12891-019-2637-6. Roche J J, Wenn R T, Sahota O, Moran C G. Effect of comorbidities and postoperative complications on mortality after hip fracture in elderly people: prospective observational cohort study. BMJ 2005; 331(7529): 1374. doi: 10.1136/bmj.38643.663843.55. Rogmark C, Leonardsson O. Hip arthroplasty for the treatment of displaced fractures of the femoral neck in elderly patients. Bone Joint J 2016; 98-B(3): 291-7. doi: 10.1302/0301-620X.98B3.36515. Rosengren B E, Karlsson M K. The annual number of hip fractures in Sweden will double from year 2002 to 2050: projections based on local and nationwide data. Acta Orthop 2014; 85(3): 234-7. doi: 10.3109/17453674.2014.916491. Rosengren B E, Bjork J, Cooper C, Abrahamsen B. Recent hip fracture trends in Sweden and Denmark with age-period-cohort effects. Osteoporos Int 2017; 28(1): 139-49. doi: 10.1007/s00198-016-3768-3. Schemitsch E H, Sprague S, Heetveld M J, Bzovsky S, Heels-Ansdell D, Zhou Q, Swiontkowski M, Bhandari M, FAITH Investigators. Loss of independence after operative management of femoral neck fractures. J Orthop Trauma 2019; 33(6): 292-300. doi: 10.1097/BOT.000000000 0001444. Sheehan K J, Williamson L, Alexander J, Filliter C, Sobolev B, Guy P, Bearne L M, Sackley C. Prognostic factors of functional outcome after hip fracture surgery: a systematic review. Age Ageing 2018; 47(5): 661-70. doi: 10.1093/ageing/afy057. Sköldenberg O, Ekman A, Salemyr M, Boden H. Reduced dislocation rate after hip arthroplasty for femoral neck fractures when changing from posterolateral to anterolateral approach. Acta Orthop 2010; 81(5): 583-7. doi: 10.3109/17453674.2010.519170. Sullivan K J, Husak L E, Altebarmakian M, Brox W T. Demographic factors in hip fracture incidence and mortality rates in California, 2000–2011. J Orthop Surg Res 2016; 11: 4. doi: 10.1186/s13018-015-0332-3. Vestergaard P, Rejnmark L, Mosekilde L. Osteoporosis is markedly underdiagnosed: a nationwide study from Denmark. Osteoporos Int 2005; 16(2): 134-41. doi: 10.1007/s00198-004-1680-8. Vestergaard P, Rejnmark L, Mosekilde L. Increased mortality in patients with a hip fracture: effect of pre-morbid conditions and post-fracture complications. Osteoporos Int 2007; 18(12): 1583-93. doi: 10.1007/s00198007-0403-3. Vochteloo A J, Moerman S, Tuinebreijer W E, Maier A B, de Vries M R, Bloem R M, Nelissen R G, Pilot P. More than half of hip fracture patients do not regain mobility in the first postoperative year. Geriatr Gerontol Int 2013; 13(2): 334-41. doi: 10.1111/j.1447-0594.2012.00904.x. von Friesendorff M, McGuigan F E, Wizert A, Rogmark C, Holmberg A H, Woolf A D, Åkesson K. Hip fracture, mortality risk, and cause of death over two decades. Osteoporos Int 2016; 27(10): 2945-53. doi: 10.1007/ s00198-016-3616-5.

Acta Orthopaedica 2021; 92 (4): 431–435

431

Proton-pump inhibitors are associated with increased risk of prosthetic joint infection in patients with total hip arthroplasty: a case-cohort study Maarten M BRUIN 1,3, Ruud L M DEIJKERS 1, Roos BAZUIN 1, Erika P M ELZAKKER 2, and Bart G PIJLS 1,3 1 Department

of Orthopedic Surgery, HagaZiekenhuis, Den Haag; 2 Department of Medical Microbiology, Hagaziekenhuis, Den Haag; 3 Department of Orthopedic Surgery, LUMC, Leiden, The Netherlands Correspondence: B.G.C.W.Pijls@lumc.nl Submitted 2020-05-10. Accepted 2021-03-19.

Background and purpose — Proton-pump inhibitors (PPI) have previously been associated with an increased risk of infections such as community-acquired pneumonia, gastrointestinal infections and central nervous system infection. Therefore, we evaluated a possible association between proton-pump inhibitor use and prosthetic joint infection (PJI) in patients with total hip arthroplasty (THA), because they can be stopped perioperatively or switched to a less harmful alternative. Patients and methods — A cohort of 5,512 primary THAs provided the base for a case-cohort design; cases were identified as patients with early-onset PJI. A weighted Cox proportional hazard regression model was used for the study design and to adjust for potential confounders. Results — There were 75 patients diagnosed with PJI of whom 32 (43%) used PPIs perioperatively compared with 75 PPI users (25%) in the control group of 302 patients. The risk of PJI was 2.4 times higher (95% CI 1.4–4.0) for patients using PPI. This effect remained after correction for possible confounders. Interpretation — The use of PPIs was associated with an increased risk of developing PJI after THA. Hence, the use of a PPI appears to be a modifiable risk factor for PJI.

One important aspect of the prevention of prosthetic joint infection (PJI) is preoperative optimization of modifiable risk factors (Kunutsor et al. 2016). Medication may be an important category of modifiable risk factors, since they can be temporarily discontinued perioperatively or they can be switched to a less harmful alternative. Proton-pump inhibitors (PPI) are of special interest, because they have been associated with an increased risk of infections such as community-acquired pneumonia, gastrointestinal infections, and central nervous system infections (Lambert et al. 2015, Cunningham et al. 2018, Hung et al. 2018). The increased risk of these infections is probably due to the fact that PPIs decrease the effectiveness of neutrophils (Aybay et al. 1995, Agastya et al. 2000, Zedtwitz-Liebenstein et al. 2002). This increased risk of infection may also apply to total hip arthroplasty (THA), possibly leading to increased risk of PJI. However, the effect of PPIs on the risk of PJI is currently unknown. Therefore, we evaluated a possible association between perioperative PPI use and early-onset prosthetic joint infection in patients with total hip arthroplasty.

Patients and methods This is a case-cohort study. We collected data of patients treated with THA between January 2009 and December 2017 in HAGA hospital in the Netherlands, which is a high-volume teaching hospital. A case-cohort design was chosen to allow efficient assessment of the risk factors (PPI use and confounders) and we used the approach of Cai and Zeng (2004) for

432

sample size considerations. This design provides similar effect estimates and standard errors compared with full cohorts, while at the same time allowing for a high level of detail. Comparable studies evaluating risk factors for PJI with similar sample size were able to detect risk factors (Choong et al. 2007, Dowsey and Choong 2008). Base-cohort and controls 5,512 patients with a primary THA were identified. We excluded patients with hemi-arthroplasty, revision surgery, and THA through an approach other than the direct anterior approach (DAA). All patients received as perioperative prophylaxis either cefazolin for low-risk patients or vancomycin and ciprofloxacin for high-risk patients as per hospital protocol. For every case, we randomly selected 4 controls from the base cohort using a random-number generator. This resulted in the study population of 75 cases and 302 controls. 3 cases were also included as controls, which is normal in case-cohort designs. This phenomenon indicates that the selection of controls was truly random: at baseline (immediately after THA) PJI is not yet diagnosed and some patients will develop PJI postoperatively. Therefore, in a random sample of controls at baseline, the percentage controls who develop PJI should be similar to the incidence of PJI in the whole cohort (Prentice 1986). This was the case in our study: 1.4% (72 of 5,512) is similar to 1% (3 of 302). Cases The cases comprise patients with early-onset PJI. Early-onset PJI was defined as PJI occurring within the first 3 months after surgery (Tande and Patel 2014). The diagnosis of PJI was made according to the major and minor MSIS criteria (Parvizi et al. 2011). To ensure that we included all cases we consulted the Dutch Arthroplasty Register (LROI) to check whether revision for infections or DAIRs (debridement, antibiotics, and implant retention) had been done in other hospitals for patients in the study cohort (van Steenbergen et al. 2015). The Dutch Arthroplasty Register identified no revisions for infections or DAIRs that had been done in other hospitals that we were unaware of for patients in our cohort. Data collection and statistics Data was extracted from the hospital information system HiX (https://chipsoft.com/solutions/532/HiX-the-most-innovativeHIS-EHR) or paper medical records by the researchers and was collected in Castor Electronic Data Capture (https://www. castoredc.com/clinical-data-management-system/). For all patients the demographic data, perioperative use of PPIs, and potential confounders such as vitamin K antagonist use was collected. Preoperative medication use was recorded by anesthetists as part of routine preoperative screening. PJI is a time-to-event outcome and effect of PPI use on PJI risk was analyzed with Kaplan–Meier statistics and weighted Cox proportional hazards regression. We used a weighted

Acta Orthopaedica 2021; 92 (4): 431–435

method according to Barlow et al. (1999) to calculate the hazard ratios (HR) and their 95% confidence interval (CI) (Barlow et al. 1999). Sub-cohort controls are weighted by the inverse of the sampling fraction α (= 302 controls/5,512 entire cohort = 0.055) and the case weight outside the sub-cohort is always 1 at failure. The following weights were thus applied: 1 for a case outside the sub-cohort at failure, 18 (= 1/0.055) for a case in the sub-cohort before failure, 1 for a case in the sub-cohort at failure, 18 for a sub-cohort control. A Kaplan– Meier curve was plotted to ensure that the proportional hazard assumption was not violated. We selected confounders based on the following criteria (Rothman et al. 2008): 1. A confounding factor must be an extraneous risk factor for the disease (i.e., PJI). 2. A confounding factor must be associated with the exposure (i.e., PPI) under study in the source population. 3. A confounding factor must not be affected by the exposure or the disease. In particular it cannot be an intermediate (mediator) step in the causal path between exposure and the disease. Demographic factors such as age, sex, and BMI have been associated with PJI as well as PPI use, so they were considered possible confounders (Pedersen et al. 2010, Hálfdánarson et al. 2018, Antonelli and Chen 2019). Regarding criterion 2, PPIs are prescribed when using NSAIDs, acetylsalicylic acid, certain immunosuppressive drugs, vitamin K antagonists and polypharmacy in some patients (see https://www.farmacotherapeutischkompas.nl/bladeren/ indicatieteksten/maagbescherming). Regarding criterion 1, anticoagulants, immunosuppressive drugs, and polypharmacy have been shown to be risk factors for PJI and do not violate criterion 3, so they were considered possible confounders and included in the model (Pedersen et al. 2010, Antonelli and Chen 2019). NSAIDs have been shown not to be associated with PJI and were therefore not considered a possible confounder (Pedersen et al. 2010). Taken together the following factors met the criteria above and were thus included in the model as possible confounders: age, sex, BMI, acetylsalicylic acid use, use of vitamin K antagonists, immunosuppressive drug use, and polypharmacy. Polypharmacy was defined as the daily use of 5 or more different medications (Masnoon et al. 2017). All analyses were conducted using R package “coxphw” to allow for calculation of robust standard errors (Dunkler et al. 2018). Ethics, data sharing, funding and potential conflicts of interest This case-cohort study was approved by our institutional ethics committee (T17-111) and we comply with the STROBE guidelines for reporting. The data is available upon reasonable request by contacting the corresponding author. The authors received no financial support for the research, and declare no conflict of interests.

Acta Orthopaedica 2021; 92 (4): 431–435

433

Table 1. Patient demographics. Values are count (%) unless otherwise specified Variable

Cases n = 75

Controls n = 302

Age, mean years (SD) Female sex BMI, mean (SD) Obesity (BMI > 30) Proton pump inhibitor Acetylsalicylic acid Vitamin K antagonist Immunosuppressive drugs Polypharmacy a

69 (10) 42 (56) 30 (5.4) 28 (38) 32 (43) 17 (23) 16 (21) 7 (9.3) 37 (49)

68 (11) 188 (62) 27 (4.2) 57 (19) 75 (25) 46 (15) 14 (4.6) 12 (4.0) 109 (36)

Defined as the daily use of ≥ 5 different medications.

Table 2. Univariable weighted Cox proportion hazard regression model Risk factor

HR (95% CI)

Age, years Sex (male) Obesity (BMI > 30) Proton pump inhibitor Acetylsalicylic acid Vitamin K antagonist Immunosuppressive drugs Polypharmacy a

1.0 (1.0–1.0) 0.8 (0.5–1.1) 2.5 (1.6–4.0) 2.4 (1.4–4.0) 1.6 (0.9–2.8) 5.4 (3.1–9.2) 2.4 (1.1–5.3) 1.7 (1.1–2.7)

a Defined as the daily use of ≥ 5 different medications. HR: hazard ratio. CI: confidence interval.

Fraction developing PJI 0.10

Proton-pump inhibitor use yes no

Table 3. Multivariable weighted Cox proportion hazard regression model for PPI 0.08

Factor Proton pump inhibitor use Crude a Proton pump inhibitor use adjusted for Model 1: age Model 2: sex Model 3: BMI Model 4: acetylsalicylic acid use Model 5: vitamin K antagonist use Model 6: immunosuppressive drug use Model 7: polypharmacy.b Model 8: all above a b

HR (95% CI) 0.06

2.4 (1.4–4.0) 2.3 (1.4–4.0) 2.4 (1.3–4.3) 1.9 (1.1–3.3) 2.3 (1.2–4.3) 2.2 (1.0–5.0) 2.4 (1.1–5.4) 2.2 (1.1–4.1) 1.9 (0.4–10)

Crude = HR from univariable model (Table 2). Defined as the daily use of ≥ 5 different medications.

Results There were 75 cases of PJI in 5,512 primary THAs, resulting in an infection rate of 1.4%. The causative micro-organisms were: S. aureus (n = 32), coagulase-negative staphylococci (n = 23), P. aeruginosa (n = 8), E. faecalis (n = 13), E. faecium (n = 1), Enterobacteriaceae (n = 23), streptococci (n = 6), and Corynebacterium ssp (n = 4); the numbers add up to more than 75 cases, because the infection was polymicrobial in 28 hips. The majority of cases were early-onset postoperative: 73 (of 75) cases were treated with a DAIR within 3 months after THA. There were 2 late (acute hematogenous) cases with onset of symptoms less than 4 weeks prior to DAIR procedure. For 74 cases, 2 or more perioperative cultures were positive. The remaining case had 1 positive perioperative culture and the minor MSIS criteria were taken into account. The mean duration between arthroplasty and DAIR was 36 days (SD 90 days). The mean follow-up for the controls was 3.8 years (SD 2.3 years; range 15 to 3,361 days) (Table 1).

0.04

0.02

0 0

Days after surgery

Figure 1. Graph showing risk of prosthetic joint infection (PJI) according to PPI use (weighted 1-minus-survival Kaplan–Meier plot).

Of the 75 patients with PJI, 32 patients (43%) used PPIs perioperatively compared with 75 (25%) in 302 patients in the control group (crude HR 2.4; CI 1.4–4.0; Table 2). After multivariable adjustment, the risk for PJI remained 2 times higher in patients using PPIs perioperatively compared with patients not using PPIs (HR 1.9; CI 0.4–10; Table 3). The Figure shows the risk of PJI according to PPI use (weighted 1-minus-survival Kaplan–Meier plot). In a sensitivity analysis the 2 late acute hematogenous PJI cases were excluded and the results remained similar: HR 2.3 compared with HR 2.4 in the original analysis.

Discussion Prosthetic joint infection in total hip arthroplasty is a severe and challenging complication. Therefore, we think preoperative screening for patients with increased risk, optimizing modifiable risk factors before surgery, and counseling patients is important. In this case-cohort study, we found that the use

434

of PPIs is associated with an increased risk of developing PJI after THA. The incidence of PJI in our base cohort of patients with THA through DAA over a period of 9 years was 1.4%. This is within range of reported infection rates for the DAA from other articles (Aggarwal et al. 2019). We are not aware of other clinical studies describing an association between PPIs and an increased risk of PJI. Also, no clear mechanism for an increase in the incidence of PJI has been described for PPI use. However, several articles describe the impact of PPI use on the immune system. Agastya et al. (2000) reported that PPIs might suppress the innate immune responses by interfering with the functionality of the neutrophils. Also, Liu et al. (2013) found PPIs might inhibit the activity of lysosomal enzymes and alter enzyme functions. Zedtwitz-Liebenstein et al. (2002) designed an experiment where human volunteers received a single dose of omeprazole resulting in decreased bactericidal activity of the neutrophils. The innate immune response and neutrophils have an important role in the host defense response against bacteria. Chronic treatment with PPIs could make patients more susceptible to bacterial infections due to the impaired immune response (Hung et al. 2018). Malnutrition caused by PPI use may be an alternative mechanism for the observed increased risk of PPIs for PJI. Malnutrition is described as a risk factor for PJI and it is also associated with delayed wound healing, persistent wound drainage, and increased susceptibility to infections (Baek 2014, Pruzansky et al. 2014, Rezapoor and Parvizi 2015). In the study by Kinoshita et al. (2018), PPIs were identified as a possible cause for hypomagnesemia, calcium deficiency, and low vitamin B12. However, it is unclear if and how this effects wound healing and the risk of PJI. Our study has several limitations. 1st, there is a slight possibility that we have missed early PJI despite our study design and despite consulting the Dutch Arthroplasty Registry, which can be considered non-differential misclassification (Rothman et al. 2008, van Steenbergen et al. 2015, Veltman et al. 2018). In most situations, non-differential misclassification of a binary disease will produce bias towards the null (no effect) (Rothman et al. 2008). This means that if we were to have missed cases (for instance acute hematogenous infections) our estimates for the risk factors would be on the conservative side. Therefore, missing cases would not lead to false identification or overestimation of risk factors for early PJI. 2nd, adjusting for all confounders, model 8 (see Table 3) showed a wider 95% CI and included 1.0; this is limited by the sample size. With an increase in the sample size, the 95% CI would become narrower, while the effect size would stay relatively constant (Lee 2016). 3rd, due to the observational design the observed effect between the use of PPI and the development of PJI should be interpreted as an association, so further research is necessary to determine possible causality (Grimes and Schulz 2002). At present, the potential benefit of temporarily stopping PPIs or switching to another medication group

Acta Orthopaedica 2021; 92 (4): 431–435

(e.g., histamine [H2] blockers or antacids) should be weighed against the risk of PJI on an individual basis. In conclusion, the results of our case-cohort study showed that the use of PPIs is associated with an increased risk of developing PJI after THA. Hence, the use of PPIs appears to be a modifiable risk factor for PJI.

The study was designed and coordinated by MMB, BGP, RLMD, EPME, and RB. Data collection and patient inclusion were performed by MMB, BGP, RB, RLMD, and EPME. Statistical analysis was done by MMB and BGP. BGP, RLMD, and MMB interpreted the data and wrote the initial draft manuscript. All authors critically revised the manuscript.

Acta thanks Martin Clauss and Ricardo Sousa for help with peer review of this study.

Agastya G, West B C, Callahan J M. Omeprazole inhibits phagocytosis and acidification of phagolysosomes of normal human neutrophils in vitro. Immunopharmacol Immunotoxicol 2000; 22(2): 357-72. Aggarwal V K, Weintraub S, Klock J, Stachel A, Phillips M, Schwarzkopf R, Iorio R, Bosco J, Zuckerman J D, Vigdorchik J M, Long W J. 2019 Frank Stinchfield Award: A comparison of prosthetic joint infection rates between direct anterior and non-anterior approach total hip arthroplasty: a single institution experience. Bone Joint J 2019; 101-B(6_Supple_B): 2-8. Antonelli B, Chen A F. Reducing the risk of infection after total joint arthroplasty: preoperative optimization. Arthroplasty 2019; 1(1): 4. Aybay C, Imir T, Okur H. The effect of omeprazole on human natural killer cell activity. General Pharmacology: Vascul Pharmacol 1995; 26(6): 141318. Baek S-H. Identification and preoperative optimization of risk factors to prevent periprosthetic joint infection. WJO 2014; 5(3): 362. Barlow W E, Ichikawa L, Rosner D, Izumi S. Analysis of case-cohort designs. J Clin Epidemiol 1999; 52(12): 1165-72. Cai J, Zeng D. Sample size/power calculation for case-cohort studies. Biometrics 2004; 60(4): 1015-24. Choong P F M, Dowsey M M, Carr D, Daffy J, Stanley P. Risk factors associated with acute hip prosthetic joint infections and outcome of treatment with a rifampin based regimen. Acta Orthop 2007; 78(6): 755-65. Cunningham R, Jones L, Enki D G, Tischhauser R. Proton pump inhibitor use as a risk factor for Enterobacteriaceal infection: a case-control study. J Hosp Infect 2018; 100(1): 60-4. Dowsey M M, Choong P F M. Obesity is a major risk factor for prosthetic infection after primary hip arthroplasty. Clin Orthop Relat Res 2008; 466(1): 153-8. Dunkler D, Ploner M, Schemper M, Heinze G. Weighted Cox regression using the R package coxphw. J Stat Soft 2018; 84(2). Grimes D A, Schulz K F. Bias and causal associations in observational research. Lancet 2002; 359(9302): 248-52. Hálfdánarson Ó Ö, Pottegård A, Björnsson E S, Lund S H, Ogmundsdottir M H, Steingrímsson E, Ogmundsdottir H M, Zoega H. Proton-pump inhibitors among adults: a nationwide drug-utilization study. Therap Adv Gastroenterol 2018; 11: 175628481877794. Hung W-T, Teng Y-H, Yang S-F, Yeh H-W, Yeh Y-T, Wang Y-H, Chou M-Y, Chou M-C, Chan C-H, Yeh C-B. Association between proton pump inhibitor use and CNS infection risk: a retrospective cohort study. JCM 2018; 7(9): 252. Kinoshita Y, Ishimura N, Ishihara S. Advantages and disadvantages of long-term proton pump inhibitor use. J Neurogastroenterol Motil 2018; 24(2): 182-96.

Acta Orthopaedica 2021; 92 (4): 431–435

Kunutsor S K, Whitehouse M R, Blom A W, Beswick A D, INFORM Team. Patient-related risk factors for periprosthetic joint infection after total joint arthroplasty: a systematic review and meta-analysis. PLoS ONE 2016; 11(3): e0150866. Lambert A A, Lam J O, Paik J J, Ugarte-Gil C, Drummond M B, Crowell T A. Risk of community-acquired pneumonia with outpatient proton-pump inhibitor therapy: a systematic review and meta-analysis. PLoS ONE 2015; 10(6): e0128004. Lee D K. Alternatives to P value: confidence interval and effect size. Korean J Anesthesiol 2016; 69(6): 555. Liu W, Baker S S, Trinidad J, Burlingame A L, Baker R D, Forte J G, Virtuoso L P, Egilmez N K, Zhu L. Inhibition of lysosomal enzyme activities by proton pump inhibitors. J Gastroenterol 2013; 48(12): 1343-52. Masnoon N, Shakib S, Kalisch-Ellett L, Caughey G E. What is polypharmacy? A systematic review of definitions. BMC Geriatr 2017; 17(1): 230. Parvizi J, Zmistowski B, Berbari E F, Bauer T W, Springer B D, Della Valle C J, Garvin K L, Mont M A, Wongworawat M D, Zalavras C G. New definition for periprosthetic joint infection: from the Workgroup of the Musculoskeletal Infection Society. Clin Orthop Relat Res 2011; 469(11): 2992-4. Pedersen A B, Svendsson J E, Johnsen S P, Riis A, Overgaard S. Risk factors for revision due to infection after primary total hip arthroplasty: a population-based study of 80,756 primary procedures in the Danish Hip Arthroplasty Registry. Acta Orthop 2010; 81(5): 542-7.

435

Prentice R L. A case-cohort design for epidemiologic cohort studies and disease prevention trials. Biometrika 1986; 73(1): 1-11. Pruzansky J S, Bronson M J, Grelsamer R P, Strauss E, Moucha C S. Prevalence of modifiable surgical site infection risk factors in hip and knee joint arthroplasty patients at an urban academic hospital. J. Arthroplasty 2014; 29(2): 272-6. Rezapoor M, Parvizi J. Prevention of periprosthetic joint infection. J. Arthroplasty 2015; 30(6): 902-7. Rothman K, Greenland S, Lash T. Modern epidemiology. 3rd ed. Philadelphia: Lippincott Williams & Wilkins; 2008. van Steenbergen L N, Denissen G A W, Spooren A, van Rooden S M, van Oosterhout F J, Morrenhof J W, Nelissen R G H H. More than 95% completeness of reported procedures in the population-based Dutch Arthroplasty Register: external validation of 311,890 procedures. Acta Orthop 2015; 86(4): 498-505. Tande A J, Patel R. Prosthetic joint infection. Clin Microbiol Rev 2014; 27(2): 302-45. Veltman E S, F D J, Nelissen R G, Poolman R W. Antibiotic prophylaxis and DAIR treatment in primary total hip and knee arthroplasty, a national survey in the Netherlands. J Bone Joint Infect 2018; 3(1): 5-9. Zedtwitz-Liebenstein K, Wenisch C, Patruta S, Parschalk B, Daxböck F, Graninger W. Omeprazole treatment diminishes intra- and extracellular neutrophil reactive oxygen production and bactericidal activity. Crit Care Med 2002; 30(5): 1118-22.

436

Acta Orthopaedica 2021; 92 (4): 436–442

Cost utility analysis of intramedullary nailing and skeletal traction treatment for patients with femoral shaft fractures in Malawi Linda CHOKOTHO 1,2, Claire A DONNELLEY 3, Sven YOUNG 1,4,5, Brian C LAU 6, Hao-Hua WU 3, Nyengo MKANDAWIRE 1,7, Jan-Erik GJERTSEN 2,4, Geir HALLAN 2,4, Kiran J AGARWAL-HARDING 8, and David SHEARER 3 1 Department

of Surgery, College of Medicine, University of Malawi; 2 Department of Clinical Medicine, University of Bergen, Bergen, Norway; 3 Institute for Global Orthopedics and Traumatology, Orthopedic Trauma Institute, University of California San Francisco, San Francisco, CA, USA; 4 Department of Orthopedic Surgery, Haukeland University Hospital, Bergen, Norway; 5 Department of Surgery, Kamuzu Central Hospital, Lilongwe, Malawi; 6 Department of Orthopedic Surgery, Duke University Medical Centre, Durham, NC, USA; 7 School of Medicine, Flinders University, Adelaide, Australia; 8 Harvard Global Orthopaedics Collaborative, Harvard Combined Orthopaedic Residency Program, Massachusetts General Hospital, Boston, MA, USA Correspondence: lindachokotho@gmail.com Submitted 2020-11-14. Accepted 2021-02-10.

Background and purpose — In Malawi, both skeletal traction (ST) and intramedullary nailing (IMN) are used in the treatment of femoral shaft fractures, ST being the mainstay treatment. Previous studies have found that IMN has improved outcomes and is less expensive than ST. However, no cost-effectiveness analyses have yet compared IMN and ST in Malawi. We report the results of a cost-utility analysis (CUA) comparing treatment using either IMN or ST. Patients and methods — This was an economic evaluation study, where a CUA was done using a decision-tree model from the government healthcare payer and societal perspectives with an 1-year time horizon. We obtained EQ5D-3L utility scores and probabilities from a prospective observational study assessing quality of life and function in 187 adult patients with femoral shaft fractures treated with either IMN or ST. The patients were followed up at 6 weeks, and 3, 6, and 12 months post-injury. Quality adjusted life years (QALYs) were calculated from utility scores using the area under the curve method. Direct treatment costs were obtained from a prospective micro costing study. Indirect costs included patient lost productivity, patient transportation, meals, and childcare costs associated with hospital stay and follow-up visits. Multiple sensitivity analyses assessed model uncertainty. Results — Total treatment costs were higher for ST ($1,349) compared with IMN ($1,122). QALYs were lower for ST than IMN, 0.71 (95% confidence interval [CI] 0.66– 0.76) and 0.77 (CI 0.71–0.82) respectively. Based on lower cost and higher utility, IMN was the dominant strategy. IMN remained dominant in 94% of simulations. IMN would be less cost-effective than ST at a total procedure cost exceeding $880 from the payer’s perspective, or $1,035 from the societal perspective.

Interpretation — IMN was cost saving and more effective than ST in the treatment of adult femoral shaft fractures in Malawi, and may be an efficient use of limited healthcare resources.

The incidence of femoral shaft fractures in low- and middleincome countries (LMICs) is estimated to range from 16 to 46 per 100,000 people per year (Agarwal-Harding et al. 2015). In Malawi, which has 17.5 million inhabitants (National Statistical Office of Malawi 2019), a recent study estimated the prevalence of femoral shaft fractures at 1.4 per 100,000 people and incidence of 27 per 100,000 people per year (AgarwalHarding et al. 2020), translating into approximately 4,700 fractures annually. For comparison, the annual incidence of femoral shaft fractures in Sweden is one third of that in Malawi (personal communication, Michael Möller, The Swedish Fracture Registry). The goal of treatment for these fractures is to achieve stability at the fracture site, thereby promoting union and painless weight-bearing, and allowing early patient rehabilitation. Treatment with intramedullary nailing (IMN) achieves this goal earlier and more consistently than skeletal traction (ST), and has become the gold standard for managing these fractures in high-income countries. In Malawi, however, treatment using ST, requiring patient immobilization in bed for at least 6 weeks, remains the mainstay treatment. Femoral shaft fractures do not only affect physical function, but also the patient’s social and psychological well-being (Haug et al. 2017, Kohler et al. 2017). Accordingly, better treatment of these fractures should improve quality of life by improving not only physical function but also social and psychological functions. A quality adjusted life year (QALY) is

Acta Orthopaedica 2021; 92 (4): 436–442

an appropriate measure of outcome as it includes both quantity and quality of life (Stothers 2006). Studies from Malawi and elsewhere have found that treatment with IMN is less costly compared with ST (Gosselin et al. 2009, Opondo et al. 2013, Kamau et al. 2014, Diab et al. 2019). However, these studies did not assess the effectiveness of these 2 treatment modalities using a generic outcome measure such as the QALY. As such it remains unclear which modality represents a better use of limited healthcare resources in terms of costs and QALYs gained. Malawi is a low-income country in Southern Africa with a gross domestic product (GDP) per capita of only US$ 380 (World Bank 2019a). In a resource-limited setting like Malawi, appropriate resource allocation to ensure optimization of the healthcare budget is a priority. Cost-effectiveness analyses of health care interventions can provide the necessary evidence needed to change clinical practice, funding, and policies for the better. We evaluated the cost-effectiveness of IMN versus ST in the treatment of femoral shaft fractures in Malawi using QALYs as a measure of effectiveness, to determine which treatment modality best represents efficient use of healthcare resources from government healthcare payer and societal perspectives.

Patients and methods Design and setting This study is a cost-utility analysis (CUA) comparing IMN with ST for treatment of adult femoral shaft fractures in Malawi. This was a planned analysis using data from a previously published prospective observational study that compared quality of life (QOL) and function for adults with closed femoral shaft fractures treated with IMN or ST in Malawi (Chokotho et al. 2020). Adult patients were recruited from 6 hospitals in Malawi: Queen Elizabeth Central Hospital (QECH), Kamuzu Central Hospital, Beit Cure International Hospital, and Chiradzulu, Thyolo, and Chikwawa district hospitals. Patients were excluded using the following criteria: (1) age less than 18 (n = 4), (2) polytrauma or multiple injuries, defined as any additional injury requiring admission on its own merits (admission was used as a proxy for severity of injury since it was not possible to calculate injury severity scores; n = 7), (3) pathological fractures (n = 2), (4) open fractures (n = 5), (5) clinical evidence of infection at the surgical site before or during surgery (n = 1), and (6) prior surgery involving the affected femur (n = 1). Study participants were treated using ST or IMN at the discretion of the treating surgeon or orthopedic clinical officer (OCO). OCOs are non-physician clinicians trained to provide nonoperative care for orthopedic conditions and emergency orthopedic surgery for selected cases, such as acute infections and open fractures (Mkandawire et al. 2008). Follow-up assessments for both groups were performed at 6 weeks, 3 months, 6 months, and 1 year after the injury. At each follow-

437

up, the patients were assessed clinically, the EQ-5D-3L was administered, and radiographs taken when feasible. Treatment technique The SIGN nail (Zirkle and Shahab 2016) was used in all IMN patients. This is a solid locking IM nail that can be inserted without the need for a fracture table or intraoperative fluoroscopy. The SIGN nail was inserted antegrade using open reduction, with the patient in the lateral position on a standard operating table. All ST patients had straight leg extension skeletal traction with a Steinmann pin inserted into the proximal tibia under local anesthesia, using an aseptic technique and a stirrup to connect the rope that was used to position the weights. The weights were positioned either using a bar or pulleys, or by placing the rope directly over the end of the bed, depending on the type of bed and equipment available at the hospital. Effectiveness data We measured the effectiveness of each treatment strategy using quality-adjusted life years (QALYs) based on the EQ5D-3L (Rabin and Charro 2001). At each follow-up time point, research assistants administered the EQ-5D-3L questionnaire to the study participants. The EQ-5D-3L is a tool used to measure health-related quality of life (HRQoL) that has been translated to Chichewa and validated for use in Malawian orthopedic patients (Chokotho et al. 2017). Utility scores were calculated using EQ-5D-3L responses based on data from the Zimbabwean population value set (Jelsma et al. 2003). QALYs were calculated from the utility scores using the area under the curve (AUC) method (Billingham et al. 1999). The AUC was calculated by multiplying the EQ-5D score at each time point by the midpoint duration between follow-up visits to capture both pre- and post-visit health states. We calculated QALYs for each of the 4 health states (successful IMN, unsuccessful IMN, successful skeletal traction, and unsuccessful skeletal traction). There was no measurable difference between groups in EQ-5D-3L index score at 1 year after treatment, hence a 1-year time horizon was used. Costing data Direct medical and overhead costs and indirect patient costs were estimated. Direct costs were estimated using time-andmotion analysis and included procedure personnel and supplies; ward personnel; medications and investigations; surgical implants; and instruments. Even though the SIGN nail is distributed free of charge in LMICs, to be more conservative in our economic analysis we included the hardware production cost, which was obtained directly from the manufacturer. The figure includes both manufacturing and distribution costs. Overhead costs included food, building maintenance, renovation, cleaning and sanitation, bedding, stationery, uniforms, protective wear, and staff training. The direct medical

438

Acta Orthopaedica 2021; 92 (4): 436–442

Table 1. Resource utilization for each treatment group including both direct and indirect costs excluding the cost of failed traction or surgery requiring reoperation expressed as US$ Factor

IMN

Direct costs Inpatient Ward personnel 264 445 Overhead 116 196 Surgical implants 136 0.0 Investigations 38 288 Procedure personnel 24 2.7 Procedure supplies 8.5 3.6 Instruments 8.7 0.2 Medications 2.2 1.6 Total 597 678 Outpatient Clinic personnel 1.0 0.8 Physiotherapy personnel 2.6 1.2 Radiography 41 29 Total 45 31 Total direct costs 642 709 Indirect costs Lost productivity 454 610 Transportation 4.0 4.0 Meals 3.4 0.7 Childcare costs 20 25 Total indirect costs 480 640 Total costs (direct + indirect costs) 1,122 1,349

Table 2. Inputs for the decision-tree model Distribution Factor Mean (95% CI) SE type

Ref. a

Costs IMN Direct inpatient cost 597 (543–651) 27 Gamma D Direct outpatient cost 45 – – – C b Cost of reoperation 900 (600–1,200) – Uniform Indirect cost (societal) 480 – – – C ST Direct inpatient cost 678 (624–732) 28 Gamma D Direct outpatient cost 31 – – – C Cost of IMN after failed ST 649 (563–735) 49 Gamma D Indirect cost (societal) 640 – – – C Utilities IMN Utility successful IMN 0.77 (0.72–0.83) 0.03 Beta C Utility after reoperation 0.71 (0.61–0.81) 0.05 Beta E ST Utility successful ST 0.72 (0.66–0.79) 0.03 Beta C Utility failed ST 0.69 (0.61–0.76) 0.04 Beta C Probabilities (%) IMN Probability reoperation 1.8 (0.0–5.3) 0.02 Beta C ST Probability failed ST 30 (22–38) 0.04 Beta C a C – Chokotho et al. 2020; D – Diab et b Assumed reoperation cost 1- to 2-fold

al. 2019; E – Eliezer et al. 2017 higher than index surgery.

and overhead cost data was collected on a subset Successful ST cST \ uSTsuccess of patients in the main clinical study at 1 of the 6 # ST sites (QECH) (Diab et al. 2019). Hourly salaries Conversion to IMN for personnel were calculated by dividing mean [cST+cIMNconvert] pFailedST Femur fracture \ uSTfail annual salary by the product of 9-hour days, which Uncomplicated is the average working hours for public hospitals cIMN \ uIMNsuccess # in Malawi, and 251 working days per year. FurEarly IMN ther details on how the direct costs were calculated Reoperation required [cIMN+cReop] \ uIMNfail have been published earlier (Diab et al. 2019). pReoperation All costs were presented in 2017 US$. Outpatient costs included clinic personnel, physiotherapy, and Figure 1. Decision-tree model of possible outcomes after ST and IMN treatment of femoral shaft fractures. Costs and effectiveness of each pathway are presented at radiography costs. Indirect costs included patient the end of each potential pathway. lost productivity, and patient transportation, meals, and childcare costs associated with hospital stay and follow-up visits. We calculated costs associated with lost Decision-tree model productivity for patients who reported either formal or infor- We constructed a simple decision-tree model (Figure 1) mal employment prior to injury. Employment was scored to compare the 2 treatments using TreeAge Pro 2020 (Pro as a binary value at each follow-up time point. Using mid- 2019). In the ST treatment strategy, there were 2 potential points between follow-up visits before and after the follow- outcomes: (1) successful traction, or (2) failure of treatment up time point, overall lost productivity was weighted by the with conversion to IMN. Successful traction was defined sum of weeks of reported unemployment, with a maximum as complete fracture union after treatment with ST. Failof 52 weeks. The costs associated with productivity loss were ure of ST treatment was defined as either delayed union calculated using a standardized wage for Malawi, adjusted or non-union of the fracture requiring conversion to IMN. using purchasing power parity to US$ (World Bank 2019b). Patients treated in the IMN group had 2 potential outcomes: Patients were interviewed to estimate transportation, meal, (1) successful IMN, or (2) failure of treatment with reopand childcare costs. Resource utilization for each treatment eration. The diagnosis of delayed union was made by the group is given in Table 1. treating clinician if, at 6 weeks or more post-injury, there

Acta Orthopaedica 2021; 92 (4): 436–442

Table 3. Output from decision-tree model including incremental cost (US$) and effectiveness with 95% confidence intervals Skeletal traction Utility 0.71 (0.66–0.76) Payer cost 903 (828–986) Societal cost 1,543 (1,469–1,625) Early IMN Utility 0.77 (0.71–0.82) Payer cost 659 (599–729) Societal cost 1,139 (1,080–1,210) Incremental utility 0.06 Incremental payer cost –244 Incremental societal cost –404

439

Utility of successful IMN (0.72–0.83) Utility of successful traction (0.79–0.66) Utility of failed traction converted to IMN (0.76–0.61) Cost of IMN (543–651) Cost of traction (732–624) Outpatient cost of IMN (0–90) Probability of failed traction (0.378–0.222) Outpatient cost of traction (62–0) Probability of reoperation (0–0.053) Utility of IMN requiring reoperation (0.61–0.81) Cost of reoperation (600–1,200) –4x105

–3x105

–2x105

–1x105

ICER

Figure 2. Tornado diagram demonstrating influence of each variable on the ICER across a plausible range of inputs based on the upper and lower bound of 95% confidence interval. Dotted line represents ICER per QALY gained for the base case.

was still tenderness and mobility at the fracture site and no radiological evidence of callus formation. Non-union was defined as no evidence of fracture healing both clinically and radiologically after at least 3 months on ST or 6 months after IMN. The primary outcome of the analysis was the incremental cost-effectiveness ratio (ICER), which was calculated by dividing the difference in cost by the difference in utility between the 2 treatment groups. The inputs for the model are given in Table 2. There was no difference in EQ-5D-3L index scores at 1 year in the primary study, hence a 1-year time horizon was used. Although typically 3% discounting would be applied, because of the 1-year time horizon we did not apply discounting. Both payer and societal perspectives were considered in the base case.

Ethics, funding, and potential conflicts of interest The study was approved by the College of Medicine Research Ethics Committee in Malawi, and the Western Norway Regional Research Committee and University of California San Francisco Institutional Review Boards. Written informed consent was obtained from all patients in the study. The study was funded by the James O. Johnston Research Grant, a PhD grant by Norad through the Norhed Project, and the Institute of Global Orthopedics and Traumatology (IGOT), University of California San Francisco. Author DS is a non-paid member of the Board of Directors for SIGN Fracture Care International. The rest of the authors declare no conflicts of interest.

Sensitivity analysis We made both deterministic and probabilistic sensitivity analyses to assess which parameters are most important for the ICER and the uncertainty of the ICER input parameters. A tornado diagram was used to perform multiple 1-way sensitivity analyses assessing the relative influence of each model input on the ICER across a range of plausible input values based on the upper and lower limits of 95% confidence intervals (CI). 1-way sensitivity analyses were presented independently where appropriate. A multivariate probabilistic sensitivity analysis (PSA) was completed by performing 10,000 iterations of the model with a unique value for each input drawn from a probability distribution. The distributions used and standard errors are shown in Table 2. In general, costs were represented using a gamma distribution (range 0 to ∞) while probabilities and utilities were represented with a beta distribution (range 0 to 1). The results of the PSA are presented as an ICER scatter plot, which visually demonstrates the outcome of each iteration of the PSA as a point on the cost-effectiveness plane.

We used data from 187 patients who completed 1-year followup to estimate utilities and probabilities, including 55 cases treated with a SIGN intramedullary nail (IMN) and 132 cases treated with ST. The overall total QALYs at 1 year were higher after IMN compared with ST (Table 3). We used data on a subset of 65 patients treated at QECH (38 IMN, 27 ST) to estimate direct costs. The total direct cost of treatment was higher in the ST group compared with IMN (see Table 3). The total societal cost was higher for ST ($1,543; CI $1,149–$1,625) than IMN (1,139; CI $1,080– $1,210). Based on higher costs from both payer and societal perspectives, and lower utility with ST, IMN was the dominant strategy.

Results

Sensitivity analysis The Tornado diagram (Figure 2) shows that the ICER was most sensitive to effectiveness of successful IMN, followed by effectiveness of successful traction. No change in the range of values for any of the variables resulted in IMN being less cost-effective than ST.

440

Acta Orthopaedica 2021; 92 (4): 436–442

ICER x103 Lorem ipsum Lorem ipsum

Incremental cost

ICER x103

Lorem ipsum Lorem ipsum

Early IMN Skeletal traction

12 10

Early IMN Skeletal traction

40 80 120

160

200

–2

240

–4

280

–6

320

–8

360

–10

–12 500 600 700 800 900 1,000 1,100 1,200 1,300 1,400 1,500

Payer total cost of IMN

Societal total cost of IMN

Figure 3. 1-way sensitivity analysis on the payer total cost of early intramedullary nailing and ICER.

Figure 4. 1-way sensitivity analysis on the societal total cost of early intramedullary nailing and ICER.

400

Figures 3 and 4 show the 1-way sensitivity analysis varying the total cost of IMN on ICER from the payer and societal perspectives respectively. IMN was dominant (more effective, less costly) up to a total procedure cost of $880 from the payer perspective or $1,035 from the societal perspective. Focusing specifically on the cost of the intramedullary implant, surgery was cost saving up to a nail cost of $472 from the payer perspective or $691 from a societal perspective. Probabilistic sensitivity analysis The ICER scatter plot (Figure 5) shows that IMN was cost saving and more effective (dominant) in 93.8% of simulations.

Discussion This study found that treatment of adult femoral shaft fractures with IMN was more cost-effective than with ST in Malawi. Sensitivity analyses showed more than 90% certainty that this conclusion is true, and will remain true for IMN procedural costs of less than $880 and $1,035 from the payer and societal perspectives, respectively. Although there were no substantial differences in effectiveness between treatment modalities at 1 year, there were small differences at the other time intervals (Chokotho et al. 2020). The cost of IMN was lower and utility higher compared with ST; IMN is therefore the dominant approach from both societal and payer perspectives. The finding of lower cost of IMN compared with ST has been reported by previous studies. Gosselin et al. (2009) found lower costs for IMN compared with ST, even after accounting for re-nailing costs following infection or non-union. Gosselin also reported better union rates with IMN than ST. Similarly, both Opondo et al. (2013) and Kamau et al. (2014) found IMN to be less costly with better healing and functional outcomes than ST among patients with femoral shaft fractures in Kenya. However, the time horizon in these studies ranged from 12 to

–0.06 –0.04 –0.02 0

0.02 0.04 0.06 0.08 0.10 0.12 0.14 0.16 0.18

Incremental effectiveness

Figure 5. ICER scatter plot demonstrating output from the probabilistic sensitivity analysis.

16 weeks, limiting the assessment of non-union. Further, there was no measure of patient-reported outcomes or preferenceweighted instruments, such as the EQ-5D, and cost was measured only from the payer perspective. While our study found similar EQ-5D scores between the 2 groups at 1-year followup, a high percentage (30%) of the patients treated with ST required conversion to IM nailing due to either delayed union or non-union (Chokotho et al. 2020). Had these patients not converted to IM nailing, it is likely that they would have had substantially worse EQ-5D scores at 1-year, meaning this conversion likely mitigated some of the negative effects as patients were “rescued” from skeletal traction complications by conversion to IMN. In this sense, the conversion biased the effectiveness estimate towards the null hypothesis. Patients treated with ST in Malawi are normally admitted to hospital for at least 6 weeks whereas those treated with IMN have an average length of stay of 17 days (Diab et al. 2019). In Malawi, patients do not pay service fees in public hospitals, therefore prolonged hospital stay is likely to have cost implications from the governmental payer perspective. A treatment method like IMN, which is both cost-saving and more effective, is certainly worth prioritizing to optimize the limited health budget. Prolonged hospital stay is also likely to have financial implications for the patients and their guardians, who usually accompany patients in the hospital during the entire admission period. Due to the lack of nursing staff in Malawi, it is customary for these guardians, who are typically family members, to serve as the primary caregiver for patients during their hospitalization, with both patients and caregivers incurring substantial indirect costs of lost productivity, and hospital-related expenses. This is the first study that has evaluated the costeffectiveness of femoral shaft fracture treatment with IMN and ST from the societal perspective. Haug et al. (2017) found that patients treated with skeletal traction complained that prolonged hospitalization caused severe financial strain because

Acta Orthopaedica 2021; 92 (4): 436–442

patients and their families were unable to engage in incomegenerating activities. In addition, they found that there was increased out-of-pocket expenditure while in hospital. Survival mechanisms to keep up with the increased expenditure included selling their property and borrowing money, sometimes with high interest rates (Damme et al. 2004, Kohler et al. 2017). Therefore, if hospital stays can be reduced through IMN, this treatment has cost-saving potential from both the governmental payer and the societal perspective. SIGN Fracture Care International currently donates intramedullary nails free of charge to many hospitals in LMICs, including Malawi. This fact increases the potential cost saving beyond this study’s estimates, since our analysis included the cost of the IM nail. Cost-effective interventions are, however, not always affordable and accessible, and such is the case for Malawi where provision of operative fracture treatment is not universal in public hospitals. Future studies should include budget impact analyses assessing the affordability of adopting a new intervention from the payer’s perspective (Sullivan et al. 2014), thereby evaluating the opportunity costs and relevant benefits associated with choosing IM nailing as first-line treatment over ST. Our study had several limitations. First, as it was not a randomized study, there were likely unmeasured confounding variables. However, the body of evidence supporting IM nailing (Amihood 1973, Gosselin et al. 2009, Opondo et al. 2013, Kamau et al. 2014), would likely make an RCT unethical to perform. Second, loss to follow-up at the different time points could lead to selection bias, thereby affecting our findings. However, there was no differential loss to follow-up as the proportions in both groups were similar. The results in our model were validated by univariate sensitivity analysis and probabilistic sensitivity analysis and both analyses showed that IMN was the cost-effective treatment approach. Third, the time horizon of 1 year used in this study may have been too short. As such we may have missed long-term QALY gains. However, effectiveness was similar between the 2 treatment groups at 1 year, likely because those who failed ST treatment, and were likely to have a poor outcome if left untreated, were switched to treatment with IMN. Conversely, in a setting where IMN is not offered, ST is likely to result in substantial loss of QALYs. Another limitation is that the decision tree in our analysis did not include all possible pathways or complications that represent the course of outcomes after treatment. Only delayed union and non-union were considered because cost data for other complications was not available. The majority of patients in this study were recruited from governmentrun hospitals, and so the findings may not be applicable to patients treated in private care facilities. However, ST treatment is not routinely offered in private hospitals, where all patients are treated with IMN, and the majority of the population in Malawi does not have medical insurance and therefore uses public hospitals where services are free at point of care.

441

Thus, our findings are applicable to the majority of health facilities in the country. In conclusion, despite its limitations, our study has shown that IMN is more effective and costs less than ST, and therefore scale-up of IMN may be an efficient use of limited healthcare resources in low-income countries. Our findings are relevant to healthcare policymakers and other stakeholders to justify and advocate for improved surgical capacity so that patients with femoral shaft fractures are treated with intramedullary nailing rather than skeletal traction.

LC designed the concept of study, analyzed the data, and drafted the manuscript. CD analyzed the data and undertook critical revision of the manuscript. SY provided input on the concept of the study, and undertook critical revision of the manuscript. HHW undertook critical revision of the manuscript. BCL, NM, JEG, and GH provided input on the concept of the study, and undertook critical revision of the manuscript. KJAH designed and implemented the indirect costs data collection and undertook critical revision of the manuscript, DS provided input on the concept of the study, analyzed the data, and undertook critical revision of the manuscript. The authors would like to thank Dr Mohamed Mustafa Diab for his contribution to the costing study that estimated the direct cost of intramedullary nailing and skeletal traction in treatment of femoral shaft fractures in Malawi using time-and-motion analysis. They would also like to thank Mr Foster Mbomuwa, who was the project coordinator, for his efforts. Acta thanks Richard A Gosselin and Henry E Rice for help with peer review of this study.

Agarwal-Harding K J, Meara J G, Greenberg S L, Hagander L E, Zurakowski D, Dyer G S. Estimating the global incidence of femoral fracture from road traffic collisions: a literature review. J Bone Joint Surg Am 2015; 97(6): e31. Agarwal-Harding K J, Chokotho L, Young S, Mkandawire N, Losina E, Katz J N. A nationwide survey investigating the prevalence and incidence of adults with femoral shaft fractures receiving care in Malawian district and central hospitals. East Cent Afr J Surg 2020; 25(3). Amihood S. Analysis of 200 fractures of the femoral shaft treated at Groote Schuur Hospital. Injury 1973; 5(1): 35-40. Billingham L, Abrams K, Jones D. Methods for the analysis of qualityof-life and survival data in health technology assessment. Health Technol Assess 1999; 3(10)1-152. Chokotho L, Mkandawire N, Conway D, Wu H H, Shearer D Techno, Hallan G, Gjertsen J E, Young S, Lau B C. Validation and reliability of the Chichewa translation of the EQ-5D quality of life questionnaire in adults with orthopedic injuries in Malawi. Malawi Med J 2017; 29(2): 84-8. Chokotho L, Wu H H, Shearer D, Lau B C, Mkandawire N, Gjertsen J E, Hallan G, Young S. Outcome at 1 year in patients with femoral shaft fractures treated with intramedullary nailing or skeletal traction in a low-income country: a prospective observational study of 187 patients in Malawi. Acta Orthop 2020; 91(6): 724-731. Damme W V, Leemput L V, Por I, Hardeman W, Meessen B. Out–of– pocket health expenditure and debt in poor households: evidence from Cambodia. Trop Med Int Health 2004; 9(2): 273-80. Diab M M, Shearer D, Kahn J G, Wu H H, Lau B C, Morshed S, Chokotho L. The cost of intramedullary nailing versus skeletal traction for treatment of femoral shaft fractures in Malawi: a prospective economic analysis. World J Surg 2019; 43(1): 87-95.

442

Eliezer E N, Haonga B T, Morshed S, Shearer D W. Predictors of reoperation for adult femoral shaft fractures managed operatively in a sub-Saharan country. J Bone Joint Surg Am 2017; 99(5): 388-95. Gosselin R A, Heitto M, Zirkle L. Cost-effectiveness of replacing skeletal traction by interlocked intramedullary nailing for femoral shaft fractures in a provincial trauma hospital in Cambodia. Int. Orthop 2009; 33(5): 1445-48. Haug L, Wazakili M, Young S, Van den Bergh G. Longstanding pain and social strain: patients’ and health care providers’ experiences with fracture management by skeletal traction; a qualitative study from Malawi. Disabil Rehabil 2017; 39(17): 1714-21. Jelsma J, Hansen K, De Weerdt W, De Cock P, Kind P. How do Zimbabweans value health states? Popul Health Metr 2003; 1(1): 11. Kamau D M, Gakuu L N, Gakuya E M, Sang E K. Comparison of closed femur fracture: skeletal traction and intramedullary nailing cost-effectiveness. East Afr Orthop J 2014; 8(1): 4-9. Kohler R E, Tomlinson J, Chilunjika T E, Young S, Hosseinipour M, Lee C N. “Life is at a standstill”: quality of life after lower extremity trauma in Malawi. Qual Life Res 2017; 26(4): 1027-35. Mkandawire N, Ngulube C, Lavy C. Orthopedic clinical officer program in Malawi: a model for providing orthopedic care. Clin Orthop Relat Res 2008; 466(10): 2385-91.

Acta Orthopaedica 2021; 92 (4): 436–442

National Statistical Office of Malawi. 2018 Population and housing census: main report. Zomba: National Statistical Office; 2019. Opondo E, Wanzala P, Makokha A. Cost effectiveness of using surgery versus skeletal traction in management of femoral shaft fractures at Thika level 5 hospital, Kenya. Pan Afr Med J 2013; 15: 42. Pro T. R2 TreeAge Software. Williamstown, MA; 2019 (accessed August 2020). Rabin R, Charro F D. EQ-5D: a measure of health status from the EuroQol Group. Ann Med 2001; 33(5): 337- 43. Stothers L. Cost-effectiveness analyses. In: Clinical research methods for surgeons (Eds. Penson D F, Wei J T). Totowa, NJ: Humana Press; 2006. p. 283-96. Sullivan S D, Mauskopf J A, Augustovski F, Caro J Augusto ski, Lee K M, Minchin M, Orlewska E, Penna P, Barrios J M R, Shau W Y. Budget impact analysis—principles of good practice: report of the ISPOR 2012 Budget Impact Analysis Good Practice II Task Force. Value Health 2014; 17(1): 5-14. World Bank 2019a. Available at: https://data.worldbank.org/indicator/ NY.GNP.PCAP.CD?locations = MW (accessed 12 July 2020). World Bank 2019b. Available at: https://data.worldbank.org/indicator/ NY.GNP.PCAP.PP.CD?locations = MW (accessed 30 July 2020). Zirkle L G, Shahab F. Interlocked intramedullary nail without fluoroscopy. Orthop Clin North Am 2016; 47(1): 57-66.

Acta Orthopaedica 2021; 92 (4): 443–447

443

Outcomes after arthroscopic revision surgery for anterior cruciate ligament injuries Alexei V YUMASHEV 1, Tatyana V BALTINA 2, and Dmitrii V BABASKIN 3 1 Department

of Prosthetic Dentistry, Sechenov First Moscow State Medical University (Sechenov University), Moscow; 2 Institute of Fundamental Medicine and Biology, Kazan Federal University, Kazan; 3 Department of Pharmacy, Sechenov First Moscow State Medical University (Sechenov University), Moscow, Russian Federation Correspondence: umaseva543@gmail.com Submitted 2020-11-03. Accepted 2021-01-19.

Background and purpose — The frequency of primary anterior cruciate ligament (ACL) reconstruction is increasing resulting in more ACL revision surgeries. Therefore, we assessed survival rates of 2 different grafts for ACL revision surgery at 1- and 5-year follow-ups, as well as physical activity levels of patients after revision surgery. Patients and methods — This is a retrospective cohort study involving 218 patients (176 males) who had revision surgery for anterior cruciate ligament injuries between 2008 and 2017 at the Clinic of Traumatology, Orthopedics and Joint Pathology Clinic (I.M. Sechenov First Moscow State Medical University). A comparison group involved 189 patients with only primary surgery. Surgical interventions were performed according to the standard procedure using bone–patellar tendon–bone (BTB) and semitendinosus/gracilis (ST/G) autografts. The results of revision surgery were assessed at 1- and 5-year follow-ups by using the Lysholm and International Knee Documentation Committee scores. Results — Malpositioned bone tunnels were found in 87/218 patients (40%). At 1 and 5 years postoperatively, the revision BTB group had significantly better results in terms of IKDC and Lysholm scores than the revision ST/G group (p = 0.03, Mann–Whitney U-test), and these results were comparable to those in the comparison group. Graft survival after revision was lower than after the primary operation. However, the survival rate of 80% is quite high and is consistent with previous findings. There were no statistically reliable differences in survival between ST/G and BTB autografts. Interpretation — The graft choice for revision ACL surgery should be decided upon before surgery based on, among other things, the state of bone tunnels, in particular their position and degree of bone resorption. Tunnel widening that exceeds 14 mm (osteolysis) would require 2-stage surgery using a BTB autograft with bone plugs because it is larger than the ST/G autograft.

Anterior cruciate ligament (ACL) repair has a steadily increasing success rate, varying between 75% and 95% (Lee et al. 2012, Shin et al. 2014). However, the need for revision surgery increases as the number of primary ACL reconstructions grows. Subjective and objective joint instability indicates graft failure and is the major indication for revision (Mohtadi et al. 2011, Magnussen et al. 2012). Revision surgery is more complex than primary reconstruction, since initially improper positioning of the bone tunnels complicates the creation of new tunnels and can entail incorrect graft choice (Shafizadeh et al. 2014). Thus, the success rate of surgery is also dependent on graft-fixation choice. Optimal graft choice and technique for revision ACL surgery remain an open debate. There is insufficient evidence for differences in long-term functional outcome between BTB (bone–patellar tendon–bone) and ST/G (semitendinosus/gracilis) grafts (Mohtadi et al. 2011). Despite preferences towards autografts (Ferretti et al. 2002, Gladilina al. 2018), some authors recommend using allografts as the least traumatic option (Bull et al. 2002). However, the choice of graft for revision surgery has been reported not to affect long-term outcomes (Ruiz et al. 2002). The interval between revision surgery and injury is another matter of debate. Early ACL revision may have a higher success rate; late interventions may result in degenerative joint disease (Shin et al. 2014). Degenerative changes in the cartilage and meniscal injuries occur with greater frequency during revision than during the primary ACL reconstruction (Stergios et al. 2012). This study aims to determine which graft is best for ACL revision surgery by looking at the survival rate of BTB and ST/G autografts and the physical activity levels of patients in the 1- and 5-year postoperative period. Knee outcomes are projected to be better after arthroscopic revision treatment with an autograft, but the autograft survival rate at revision

444

may be lower than after primary surgery. We also investigated the role of bone tunnel malposition in the re-emergence of knee instability.

Patients and methods Study design This study includes 218 patients who underwent ACL revision surgery between 2008 and 2017 at the Traumatology, Orthopedics and Joint Pathology Clinic, I.M. Sechenov First Moscow State Medical University (study group). There were 176 men and 42 women aged 19 to 36 years. The inclusion criterion was a recurrent knee instability after primary ACL repair, reported to occur less than a year before complaint. The reported injury mechanisms of recurrent knee instability were sports activities (71%) and household accidents (17%); 12% of patients denied injury, reporting that knee instability persisted after the primary ACL surgery. The primary ACL surgery was by an autogenous BTB autograft (44%), an ST/G autograft (38%), or a synthetic graft (18%). The comparison or control group involved 180 patients, including 135 males and 45 females aged 19–42 years, who underwent primary ACL surgeries within the same time period as the study group. In the control group, 68% of patients had suffered an ACL rupture due to sports injury and 32% of patients had ruptures due to household injuries. BTB autografts were used in 52% of patients, while the ST/G hamstring autografts were used to treat ACL injuries in 48% of patients. All revision surgeries were done by the same surgical team at the Traumatology, Orthopedics and Joint Pathology Clinic. Patients in the study group underwent primary surgery at different clinics, and patients in the control group were primary operated at Traumatology, Orthopedics and Joint Pathology Clinic. The surgical team performing primary surgery on patients in the control group was the same team that did revision surgeries among patients in the study group. For both the control and study groups, the exclusion criteria were patients with damage to the meniscus, and cartilage defects. Procedures To diagnose the ACL injury, 3 common tests were applied: the Lachman test, the anterior drawer test, and the pivotshift test. Knee arthrometry was performed using a KT-1000 knee arthrometer (MEDmetric Corp, San Diego, CA, USA). Results from the Lachman and anterior drawer tests were then compared with the KT-1000 measurements. The study included patients who had positive Lachman tests, 2+ anterior drawer tests or greater, more than 3 mm of displacement between healthy and injured joints on KT 1000 testing, and grade 2 pivot shift tests or greater (noticeable displacement, rough slide with a click). All revision patients underwent frontal and lateral radiography in the supine position to depict the size and position of the

Acta Orthopaedica 2021; 92 (4): 443–447

bone tunnels. In some cases, patients were assigned to receive computed tomography (CT) for a more accurate evaluation, which was crucial for good preoperative planning. The preoperative planning also involved three-dimensional (3D) reconstruction of the knee joint using CT images. Patients with no contraindications also underwent MRI scanning with a view to evaluating the strength and type of graft fixation. The tibial bone tunnel was classified as positioned correctly if, on the frontal radiograph, it formed an angle between 60°and 65° with respect to the medial joint line of the tibia; and, on the lateral radiograph, the tunnel was posterior and oriented parallel to the Blumensaat line. The femoral bone tunnel was classified as positioned correctly if it met the following criteria: in the ‘over-the-top position’; the tunnel was located 2 mm ventral to the posterior cortical layer of the femur. Revision arthroscopic knee procedure The choice of treatment procedure for the management of bone tunnels depended on the position thereof after primary surgery. When the previous tunnel did not interfere with new tunnel placement, it was left intact. In cases of tunnel overlap the hardware removal was mandatory, and the previous tunnels were filled with osteoplastic material. Bone defects that remain after hardware removal can be substantial, as in the case of osteolysis, and require filling with a bone plug. In cases of massive osteolysis (tunnel diameter of ≥ 14 mm) a 2-stage approach was used. The second stage of the revision took place after the 4–6-month mark when the previous tunnels filled with osteoplastic material showed the radiographic signs of consolidation. The 1-stage approach was used in other cases. The graft was chosen before surgery. The graft fixation type was dictated by the previous fixation type used and degree of bone resorption. In cases of massive osteolysis (tunnel diameter of ≥ 14 mm), the BTB autograft with bone plugs was used because it is larger as compared with ST/G autografts, while the ST/G autograft was employed only in cases with unexpanded tunnels. Of all patients who underwent ACL revision surgery, 43% received BTB autografts and 57% received ST/G autografts. Surgical intervention was performed under spinal anesthesia; the patient was placed on a standard operating table. Standard anteromedial and anterolateral arthroscopy portals were used. The surgical procedure was as follows. First, a graft choice was made. Then, the failed hardware (an old graft and fixation screws) was removed, and new tunnels were drilled in the femur and tibia using a standard technique. A new autograft was incorporated and fixed into the bone tunnels using the standard methods (i.e., Endobutton+interference screw fixation for ST/G autografts; Rigid-fix+interference screw fixation for BTB autografts). Graft tensioning was performed at full extension, as well as at 45° and 90° of flexion. A wound was closed in layers, with an active drainage system left in the joint cavity. All patients underwent thromboprophylaxis with enoxaparin sodim or dalteparin sodium.

Acta Orthopaedica 2021; 92 (4): 443–447

Postoperative regimen and recovery Active drainage was removed 24 to 36 hours after surgery. A rigid knee brace was applied and maintained for 3 weeks. Knee flexion was only allowed at rehabilitation visits 8 days after surgery, whilst the full range of motion was allowed at the end of knee brace usage. For the first 3 weeks after the operation patients were prohibited from inflicting load on their operated lower limbs. 8 days after the operation patients were allowed partial loading of the operated limb (15–20 kg), which increased to half weight-bearing 15 days after the surgery. All patients were allowed full weight-bearing and to walk without crutches after 2 weeks of rehabilitation. All patients in the comparison and experimental groups underwent a rehabilitation program, which included the use of a continuous passive motion machine, manual knee mobilization, lymphatic massage, electrical muscle stimulation, phonophoresis, and magnetic and laser therapy. Assessment The outcomes were measured using the Lysholm knee questionnaire and the International Knee Documentation Committee (IKDC) questionnaire. Revision surgery patients were asked to fill in the forms before revision and postoperatively, at 1-year and 5-year follow-ups. A year after the operation, the survey embraced all patients in the study and control group. At 5 years after revision surgery, the survey involved only patients operated on before the first half of 2014 because other patients had incomplete follow-up at the time. Among them were 112 patients in the revision surgery group, 54 with hamstring autografts and 58 with BTB autografts, and 94 patients in the primary surgery group, 43 with hamstring autografts and 51 with BTB autografts. Graft integrity after revision was assessed through MRI imaging at 6 months, at 1 year, and at 5 years postoperatively. For comparison purposes, a similar test was conducted in the control group. Data analysis Data were processed in Excel 2016 (Microsoft Corp, Redmond, WA, USA) and STAT1ST1CA 10.0 (Statsoft, Tulsa, OK, USA). Spearman’s rank correlation and the Mann–Whitney U-test were used. The null hypothesis regarding the normality of distribution was rejected at p < 0.05. Ethics, funding, and potential conflicts of interest The study was conducted in accordance with the ethical principles approved by the Ethics Committee of I.M. Sechenov First Moscow State Medical University (Protocol No. 4 of 22.03.2018) and in accordance with the World Medical Association Declaration of Helsinki. All the patients have given written informed consent. Tatyana Baltina was funded by the subsidy allocated to Kazan Federal University for the state assignment in the sphere of scientific activities, No 17.9783.2017/8.9. No conflicts of interest were declared.

445

Table 1. Postoperative knee function. Values are mean score (standard deviation) 1-year follow-up 5-year follow-up Revision group Primary Revision group Primary Scale BTB ST/G group BTB ST/G group IKDC 76 (5.4) 67 (9.1) 79 (7.3) 82 (9.2) 70 (11) Lysholm 79 (7.7) 71 (11) 81 (6.7) 87 (10) 72 (12)

85 (7.7) 89 (8.5)

BTB = bone–patellar tendon–bone autografts, ST/G = semitendinosus/gracilis hamstring autografts.

Results Bone tunnel positioning 1–2 bone tunnels were found to be malpositioned in 87/218 patients. Of these, 39% reported instability caused by injuries. The Spearman coefficient values indicated a relationship between instability and malposition of bone tunnels (р = 0.03). Baseline characteristics of study patients In the study group, Lysholm scores demonstrated good knee function in 14 patients (6%; 77–86 points), fair in 64 patients (29%; 67–76 points), and poor in 140 patients (64%; < 66 points). The mean score was 44 (SD 11). According to the IKDC scores, 11 patients (5%; 80–89 points) have nearly normal knee function, 62 patients have abnormal function of the knee (28%; 70–79 points), and 145 patients have a very abnormal knee function (67%; < 70 points). The mean score was 39 (12). Outcomes at 1 year and 5 years after the revision surgery Table 1 provides comparative surgical outcomes measured at 1 year and 5 years postoperatively. The Mann–Whitney U-test shows statistically significant differences in the 1-year postoperative knee function between patients who underwent revision with BTB and hamstring autografts (p = 0.04). The differences were also statistically significant between the group of patients who had revision with hamstring autografts and the primary surgery patients (p = 0.04). There was no statistically significant difference between the patients who underwent revision with BTB autografts and the primary surgery patients. The 5-year outcomes show a similar trend. Based on the results of the revision ACL surgery, BTB autografts performed better than ST/G autografts. The Lysholm and IKDC scores in patients who underwent revision with ST/G autografts did not improve significantly between the 1-year and 5-year follow-ups. The revision BTB group, on the other hand, demonstrated a statistically significant improvement (p = 0.04). In addition, the results of the revision ACL

446

Acta Orthopaedica 2021; 92 (4): 443–447

Table 2. Level of sports participation before and after revision Sports participation level (%)

Before revision n = 218

After revision 1 year 5 years n = 218 n = 112

Not involved 23 25 29 Amateur 15 38 53 Professional 62 37 18

surgery with BTB autografts were found to be comparable to those after the primary ACL surgery. There were no statistically significant differences between patients who underwent 1-stage or 2-stage surgeries. Sports and recreation activity The proportion of patients who did not participate in sports before injury did not change significantly but the proportion of professional players decreased in favor of amateur athletes, from 62% to 18% by the 5-year follow-up (Table 2). Thus, most patients after revision continued to play sports but a substantial number of them left professional sports and their level of sports participation changed. There were no statistically significant differences in sports participation between patients with BTB and ST/G autografts. Graft status Graft survival at 1 and 5 years (p = 0.04 and p = 0.04, respectively) after revision is lower than after the primary operation (Table 3). However, a survival rate of 80% is rather high and consistent with the results of other studies. There were no statistically significant differences in survival between ST/G and BTB autografts (p = 0.09).

Discussion As the number of primary anterior cruciate ligament operations grows, revision surgery becomes an increasing area of interest. The ACL revision procedure depends on several factors: the position of bone tunnels, degree of bone resorption, previous graft type, and fixation choice. The causes of graft failure include (Magnussen et al. 2012, Mariscalco et al. 2013): (1) preoperative: concomitant damage to the capsular-ligamentous knee apparatus (meniscus tear, cartilage defects); (2) intraoperative: improper graft choice, inadequate notch dimensions, improper positioning of bone tunnels, improper tensioning of graft, inadequate graft fixation; (3) postoperative: the lack of graft remodeling and revascularization, inadequate recovery program. According to many researchers, the majority of recurrent instability episodes that emerge after ACL reconstruction relate to malpositioning of the femoral tunnel (Paterno et al.

Table 3. Graft survival rates Graft rupture (%)

Revision group 6 months 1 year 5 years n = 218 n = 218 n = 112

Primary group 6 months 1 year 5 years n = 218 n = 218 n = 112

No 100 91 80 100 95 91 Partial 0 7 12 0 4 6 Complete 0 2 8 0 1 3

2014, Yasuda et al. 2016, Ochi et al. 2017). Some studies even distinguish the incorrect position of bone tunnels as a main reason for the postoperative recurrence of instability, which requires revision in 70–80% of cases (Mariscalco et al. 2013). This is corroborated by our findings, as we noted malposition of the bone tunnels in 40% undergoing revision surgery. Nowadays, there are 3 approaches to the creation of the femoral tunnel: transtibial technique, anteromedial technique, and retrograde technique. The latest studies (Rahardja et al. 2020), however, found no differences in the risk of revision between transtibial and anteromedial techniques at short-term followup. There was a slight difference in favor of the anteromedial portal technique detected a year after operation but the authors deemed it clinically insignificant, assuming that surgeons can achieve better results with any other method of tunnel creation (Rahardja et al. 2020). Much has been written about the challenge of tunnel enlargement after primary ACL reconstruction by the use of an ST/G hamstring graft (Iorio et al. 2013, Zhang et al. 2014, Weber et al. 2015). Favored for a less traumatic influence on the donor area, ST/G hamstring grafts have experienced a large wave of popularity growth. There is biomechanical evidence showing that hamstring grafts are stronger that BTB grafts (Schimoler et al. 2015, Stolarz et al. 2016). The widening of bone tunnels is driven by many factors such as the graft fixation technique, surgical approach, rehabilitation protocol, and the diversity of biological factors. The major challenge of using a hamstring tendon graft for ACL reconstruction is tendon–bone incorporation, as the biology of tendon graft–bone tunnel healing is incompletely understood (Chen 2009). Most authors agree that tunnel widening correlates indirectly with the clinical outcomes of the ACL reconstruction (Iorio et al. 2013, Weber et al. 2015). Yet, it poses a challenge during revision. During this study, the BTB autografts proved to be more effective regarding long-term IKDC and Lysholm scores, as compared with ST/G hamstring autografts. The survival rate of revision grafts is 80%, which is lower than that of primary grafts. This result is consistent with previous studies (Grassi et al. 2017, Mohan et al. 2018). We found no statistically significant differences in sports participation between patients having BTB and ST/G hamstring autografts. The return-to-sports rate following revision ACL reconstruction is lower than that after primary ACL surgery. A

Acta Orthopaedica 2021; 92 (4): 443–447

relatively high rate of return to sport at any level was reported in patients who underwent revision ACL reconstruction, but the rate of return to sport at pre-injury level was relatively low (Glogovac et al. 2019). We found that the majority of revision patients did not stop playing sports, but many switched from professional to amateur activities. This suggests that returning to the pre-injury level of physical activity after ACL revision may be a problem, but there is a good chance of retaining lower-level capabilities. In conclusion, our results confirm a high degree of bonetunnel malposition in patients undergoing revision ACL surgery due to recurrence of knee instability. BTB autografts outperformed ST/G autografts in the short and long term. The 5-year autograft survival after revision operation was around 80%, which was lower than that after primary intervention (~91%). The postoperative level of sports participation decreased slightly.

AY, TB, and DB contributed equally to the research. AY wrote and edited the article. TB conducted data analysis. DB studied scientific literature on the topic. All authors read and approved the final manuscript. Acta thanks Marco Castagnetti and Christer Rolf for help with peer review of this study.

Bull A M J, Earnshaw P H, Smith A, Katchburian M V, Hassan A N A, Amis A A. Intraoperative measurement of knee kinematics in reconstruction of the anterior cruciate ligament. J Bone Joint Surg 2002; 84B: 107-81. Chen C H. Strategies to enhance tendon graft–bone healing in anterior cruciate ligament reconstruction. Chang Gung Med J 2009; 32(5): 483-93. Ferretti A, Conteduca F, Morelli F, Masi V. Regeneration of the semitendinosus tendon after its use in anterior cruciate ligament reconstruction: a histologic study of three cases. Am J Sports Med 2002; 30(2): 204-7. Gladilina I P, Yumashev A V, Avdeeva T I, Fatkullina A A, Gafiyatullina E A. Psychological and pedagogical aspects of increasing the educational process efficiency in a university for specialists in the field of physical education and sport. Espacios 2018; 39(21): 11. Glogovac G, Schumaier A P, Grawe B M. Return to sport following revision anterior cruciate ligament reconstruction in athletes: a systematic review. Arthroscopy 2019; 35(7): 2222-30. Grassi A, Kim C, Marcheggiani Muccioli G M, Zaffagnini S, Amendola A. What is the mid-term failure rate of revision ACL reconstruction? A systematic review. Clin Orthop Relat Res 2017; 475(10): 2484-99. Iorio R, Di Sanzo V, Vadalà A, Conteduca J, Mazza D, Redler, Bolle A, Conteduca F, Ferretti A. ACL reconstruction with hamstrings: how different technique and fixation devices influence bone tunnel enlargement. Eur Rev Med Pharmacol Sci 2013; 17(21): 2956-61. Lee Y S, Sim J A, Kwak J H, Nam S W, Kim K H, Lee B K. Comparative analysis of femoral tunnels between outside-in and transtibial double

447

bundle anterior cruciate ligament reconstruction: a 3-dimensional computed tomography study. Arthroscopy 2012; 28: 1417-23. Magnussen R A, Lawrence J T, West R L, Toth A P, Taylor D C, Garrett W E. Graft size and patient age are predictors of early revision after anterior cruciate ligament reconstruction with hamstring autograft. Arthroscopy 2012; 28: 526-31. Mariscalco M W, Flanigan D C, Mitchell J, Pedroza A D, Jones M H, Andrish J T, Parker R D, Kaeding C C, Magnussen R A. The influence of hamstring autograft size on patient-reported outcomes and risk of revision after anterior cruciate ligament reconstruction: a Multicenter Orthopaedic Outcomes Network (MOON) cohort study. Arthroscopy 2013; 29: 1948-53. Mohan R, Webster K E, Johnson N R, Stuart M J, Hewett T E, Krych A J. Clinical outcomes in revision anterior cruciate ligament reconstruction: a meta-analysis. Arthroscopy 2018; 34(1): 289-300. Mohtadi N G, Chan DS , Dainty K N, Whelan D B. Patellar tendon versus hamstring tendon autograft for anterior cruciate ligament rupture in adults. Cochrane Database Syst Rev 2011; 9: 3-5. Ochi M, Dejour D, Nakamae A, Ntagiopoulos P G. Diagnosis of partial ACL rupture. In: Nakamura N, Zaffagnini S, Marx RG, Musahl V. Controversies in the technical aspects of ACL reconstruction. Berlin, Heidelberg: Springer; 2017. Paterno M V, Rauh M J, Schmitt L C, Ford K R, Hewett T E. Incidence of second ACL injuries 2 years after primary ACL reconstruction and return to sport. Am J Sports Med 2014; 42(7): 1567-73. Rahardja R, Zhu M, Love H, Clatworthy M G, Monk A P, Young S W. No difference in revision rates between anteromedial portal and transtibial drilling of the femoral graft tunnel in primary anterior cruciate ligament reconstruction: early results from the New Zealand ACL Registry. Knee Surg Sports Traumatol Arthrosc 2020; 28: 3631-8. Ruiz A L, Kelly M, Nutton R W. Arthroscopic ACL reconstruction: a 5–9year follow-up. Knee 2002; 9: 197-200. Schimoler P J, Braun D T, Miller M C, Akhavan S. Quadrupled hamstring graft strength as a function of clinical sizing. Arthroscopy 2015; 31(6): 1091-6. Shafizadeh S, Balke M, Hagn U, Hoeher J, Banerjee M. Variability of tunnel positioning in ACL reconstruction. Arch Orthop Trauma Surg 2014; 134(10): 1429-36. Shin Y S, Ro K H, Jeon J H, Lee D H. Graft-bending angle and femoral tunnel length after single-bundle anterior cruciate ligament reconstruction: comparison of the transtibial, anteromedial portal and outside-in techniques. Bone Joint J 2014; 96B(6): 743-51. Stergios P, Georgios K, Konstatntinos N, Efthymia P, Nikolaos K, Alexandros P G. Adequacy of semitendinosus tendon alone for anterior cruciate ligament reconstruction graft and prediction of hamstring graft size by evaluating simple anthropometric parameters. Anat Res Int 2012; 1: 424158. Stolarz M, Ficek K, Binkowski M, Wróbel Z. Bone tunnel enlargement following hamstring anterior cruciate ligament reconstruction: a comprehensive review. Phys Sportsmed 2016; 45(1): 31-40. Weber A E, Delos D, Oltean H N, Vadasdi K, Cavanaugh J, Potter H G, Rodeo S A. Tibial and femoral tunnel changes after ACL reconstruction: a prospective 2-year longitudinal MRI study. Am J Sports Med 2015; 43(5): 1147-56. Yasuda K, Kondo E, Kitamura N. Anatomic double-bundle reconstruction procedure. In: Ochi M, Shino K, Yasuda K, Kurosaka M. ACL injury and its treatment. Tokyo: Springer; 2016. Zhang Q, Zhang S, Cao X, Liu L, Li R. The effect of remnant preservation on tibial tunnel enlargement in ACL reconstruction with hamstring autograft: a prospective randomized controlled trial. Knee Surg Sports Traumatol Arthrosc 2014; 22(1): 166-73.

448

Acta Orthopaedica 2021; 92 (4): 448–451

A projection of primary knee replacement in Denmark from 2020 to 2050 Louise DAUGBERG 1, Thomas JAKOBSEN 1,2, Poul Torben NIELSEN 2, Mathias RASMUSSEN 2, and Anders EL-GALALY 2 1 Department of Clinical Medicine, Aalborg University; 2 Interdisciplinary Orthopaedics, Aalborg University Hospital, Denmark Correspondence: LD: Louise.Holm@rn.dk Submitted 2020-09-22. Accepted 2021-02-09.

Background and purpose — The incidence of knee replacements (KRs) has increased in the past decades. Previous studies have forecast a continuous and almost exponential rise in the use of KRs, but this rise must cease at some point. We estimated when and at what incidence the use of KRs will plateau in Denmark. Patients and methods — We retrieved 138,223 primary KRs conducted from 1997 to 2019 from the Danish Knee Arthroplasty Registry. Censuses from 1997 to 2019 as well as population projections from 2020 through 2050 were collected from Statistics Denmark. We applied logistic and Gompertz regression analysis to the data to estimate the future incidence until 2050 with root mean squared error (RMSE) as a quantitative measurement of the models’ fit. Results — The Danish incidence of KRs from 1997 to 2009 increased by more than 300%, but has stalled since 2009. Logistic and Gompertz regression had an RMSE of 14 and 15 indicating that these models fitted the data well. Logistic and Gompertz regressions estimated that the maximum incidence will be reached in 2030 at 250 (95% prediction interval [PI]) 159–316) KRs per 105 or in 2035 at 260 (PI 182–336) KRs per 105, respectively. Interpretation — The Danish incidence of KRs seems set to plateau within the coming decades. Countries experiencing a current exponential rise at a lower incidence may benefit from this study’s projection when forecasting their future demand for KRs.

Only few studies have attempted to project the future demand for knee replacements (KRs) (Kurtz et al. 2007, Culliford et al. 2015, Patel et al. 2015, Guerrero-Ludueña et al. 2016) with only 1 conducted in a recent Swedish population (Nemes et al. 2015). Most of these studies are based on historical data from a period when the countries experienced a rapid, almost exponential, growth in the incidence of KRs. Historically Denmark has experienced a similar growth, but within the last decade the increase in incidence has stalled. In all countries, a similar stagnation is to be expected. Yet, when a country is experiencing a rapid increase in the incidence of KRs it is difficult to reliably estimate at which timepoint and volume the incidence will stagnate. Therefore, we used the stagnating incidence in Denmark to make a more reliable estimation of when and at what volume the incidence of primary KRs will plateau during 2020 to 2050.

Patients and methods Study design This is a cross-sectional study based on data from primary KRs conducted in Denmark from 1997 through 2019 and presented in accordance with the STROBE Statement. Data source The study was based on the Danish Knee Arthroplasty Registry (DKR), which has monitored KRs conducted in Denmark since 1997 (Pedersen et al. 2012). Within the first decade (i.e., 1997–2007) the completeness of primary KRs in the DKR rose steadily and since 2007 the completeness has been above 90%. The DKR is linked to the Danish National Patient Registry (DNPR), which is the major administrative database in Denmark collecting information on all medical treatments

Acta Orthopaedica 2021; 92 (4): 448–451

Primary knee arthroplasties in the Danish Knee Arthroplasty Registry 01.01.1997–31.12.2019 n = 141,085 Excluded (n = 2,862): – age < 30, 115 – age > 99, 3 – duplicates, 1,858 – condylar implants, 170 – undefined arthoplasties, 716 Included n = 138,223

449

Predicted logistic incidence per 105 citizens 250

250

200

150

100

actual incidence predicted incidence 95% confidence interval

Figure 1. Inclusion and exclusions of study cohort.

Predicted Gompertz incidence per 105 citizens

2000

2010

2020

2030

2040

2050

2000

2010

2020

2030

2040

2050

Figure 2. Logistic and Gompertz regression analysis of primary KR incidence per 105 citizens based on data from the DKR between 1997 and 2019. The red line is the actual incidence of primary KRs per 105 individuals from 1997 to 2019, while the blue line is the regression from 1997 to 2050. The green areas signify the 95% confidence intervals.

conducted at public and private hospitals since 1968 (Schmidt et al. 2015). This linkage was used to adjust for unregistered primary KRs in the DKR, and thus the total number of primary KRs was in accordance with the DNPR, whereas the reported subtypes of KRs (e.g., total knee arthroplasties or unicompartmental knee arthroplasties) were based solely on the registered KRs in the DKR. Censuses from 1997 to 2019 and population projections from 2020 through 2050 were collected from Statistics Denmark. Statistics Denmark is the central Danish authority of collecting, processing, and publishing statistical information e.g., censuses in Denmark (Statistics Denmark; https://www. dst.dk/en). The data used in this study was collected on March 9, 2020. Study cohort From the DKR, we retrieved information on all primary KRs conducted from 1997 to 2019. Individuals younger than 30 or older than 99 were excluded as they do not represent the typical patient undergoing KR. After exclusion of these as well as duplicates, undefined implants, and condylar implants (e.g., hemicaps), 138,223 primary KRs were included in the study cohort (Figure 1). Data analyses and modelling The KRs were divided by type of arthroplasty to see the evolution in the use of different KRs from 1997 to 2019. The censuses from Statistics Denmark were used to determine the incidence per 105 of primary KRs from 1997 through 2019. From the incidence and population projections, we estimated the annual number of primary KRs to be conducted in Denmark from 2020 through 2050. The annual incidence of KRs was regressed on each calendar year from 1997 throughout 2050 with the use of logistic and Gompertz regression analysis on the actual incidences

between 1997 and 2019. Logistic regression assumes that the quantity of knee replacements increases in a similar fashion to an exponential curve but gradually slows to linear growth. Gompertz regression analysis assumes that the quantity of knee replacements increases similarly to the logistic model, but the upper asymptote is approached more gradually than in the logistic, where the curve is symmetric. Statistics Root mean squared error (RMSE) was used as a quality estimator of the models’ fit to the data points and used to pick the best-fitted models. All estimates were rounded to the nearest hundreds or thousands (when the projected numbers exceeded 1.000), and presented with their 95% prediction interval (PI). All analyses were conducted in JMP Pro 15 by SAS (SAS Institute, Cary, NC, USA). Ethics, funding, data sharing, and potential conflicts of interests This study was approved by the North Denmark Region (ID: 2019-107) and by the Steering Committee of the Danish Knee Arthroplasty Registry (DKR-2019-10-16). The study was financed by Interdisciplinary Orthopedics at Aalborg University Hospital. All the data used in this study can be retrieved from the Danish Knee Arthroplasty Registry and Statistics Denmark. All data retrieved from these sources and used in this study can be seen in Table 1. None of the authors report any conflicts of interest.

Results The annual number of primary KRs conducted in Denmark has increased exponentially, from 2,003 in 1997 to 7,651 in 2008. Since 2008, the increase has gradually stalled and in 2019, 10,184 primary knee replacements were inserted in Denmark

450

Acta Orthopaedica 2021; 92 (4): 448–451

Table 1. Baseline data and characteristics Absolute Incidence Population Primary KR Partial Complete- number of per 105 Year age 30–99 in the DKR KR ness (%) primary KR Danes 1997 3,274,026 1,386 29 69 2,003 1998 3,302,439 1,869 40 83 2,242 1999 3,325,149 1,816 46 77 2,373 2000 3,343,867 2,235 92 89 2,522 2001 3,363,911 2,690 98 87 3,105 2002 3,386,844 3,599 220 83 4,339 2003 3,409,437 3,940 333 87 4,549 2004 3,428,946 4,209 381 84 5,009 2005 3,449,794 4,686 414 83 5,616 2006 3,471,898 5,445 479 88 6,204 2007 3,489,318 7,003 594 93 7,504 2008 3,507,360 6,987 707 91 7,651 2009 3,528,810 8,219 788 97 8,430 2010 3,544,098 8,309 901 96 8,670 2011 3,559,168 8,030 888 97 8,278 2012 3,570,344 8,042 784 97 8,299 2013 3,583,952 7,994 806 97 8,209 2014 3,598,842 8,240 897 98 8,394 2015 3,619,744 8,180 1,214 98 8,307 2016 3,646,223 8,067 1,462 98 8,264 2017 3,671,269 8,159 1,727 97 8,384 2018 3,695,198 9,240 1,823 97 9,535 a 2019 3,719,214 9,878 2,263 97 10,184

61 68 71 75 92 128 133 146 163 179 215 218 239 245 233 232 229 233 229 227 228 258 274

an acceptable fit and thus these models were used for forecasting. The regressions forecast a maximal incidence at 250 (PI 159–316) per 105 in 2030 (logistic regression) or 260 (PI 182– 336) per 105 in 2035 (Gompertz regression) (Table 2). However, as the population grows the annual number of primary KRs continues to increase until 2050 (Table 2) when between 10,379 (logistic regression) and 10,808 (Gompertz regression) are expected.

Discussion

We found that the incidence of primary knee replacement will plateau within the next 10–15 years with an expected maximal annual incidence between 250 and 260 per 105 Danes. Comparable studies from the USA, United Kingdom, Spain, Australia, Germany, and Sweden have either found no maximal incidence within their projection, a higher maximal incidence per 105 citizens than presented in our study, a slowing incidence rate (Bini et al. KR, knee replacement 2011), or the nearing of what seems to be a maxDKR, the Danish Knee Arthroplasty Registry a Assumed completeness due to a lack of data. imal incidence. The higher maximal incidences in the aforementioned studies might be due to wider inclusion of KRs, such as revisions, or Table 2. Projection results based on logistic and Gompertz regression of the pridifficulties estimating the maximal incidence at mary KR incidence per 105 Danes between 1997 and 2050 the time of an exponential or linear rising incidence (Kurtz et al. 2007, Culliford et al. 2015, Logistic regression Gompertz regression Nemes et al. 2015, Patel et al. 2015, Guerrero Predicted Predicted Ludueña et al. 2016, Inacio et al. 2017a, 2017b, Population number of number of Rupp et al. 2020). It must be noted that a plateau age 30–99 Incidence (PI) primary KRs Incidence (PI) primary KRs in the annual number of primary KRs seems to 2000 3,343,867 86 (79–239) 2,878 90 (80–236) 3,009 be ensuing in the study by Guerrero-Ludueña et 2005 3,449,794 172 (93–252) 5,947 172 (95–250) 5,937 al. (2016); however, data from the private sector 2010 3,544,098 226 (106–264) 8,004 221 (109–264) 7,850 2015 3,619,744 244 (120–277) 8,819 244 (124–278) 8,844 was not included, therefore the forming of a pla2020 3,745,142 248 (133–290) 9,298 254 (139–293) 9,506 teau can neither be confirmed nor denied. 2025 3,881,337 249 (146–303) 9,679 258 (153–307) 9,999 Niemeläinen et al. (2017) investigated the dif2030 3,990,028 250 (159–316) 9,961 259 (168–322) 10,339 2035 4,061,503 250 (172–329) 10,142 260 (182–336) 10,548 ferent incidences of knee arthroplasty in Den2040 4,116,730 250 (184–343) 10,280 260 (196–351) 10,700 mark, Sweden, Norway, and Finland from 1997 2045 4,135,043 250 (197–356) 10,326 260 (210–366) 10,752 through 2012. They found that the incidences of 2050 4,156,132 250 (209–370) 10,379 260 (223–381) 10,808 primary TKA and UKA per 10,000 inhabitants The estimations of annual numbers were based on the incidence regressions along over the age of 30 increased in all 4 countries: with the predicted population size in the given year. Denmark (3.4–21), Sweden (9.0–21), Norway (3.6–14), and Finland (13–28). However, they (Table 1). The mean patient age at knee replacement surgery assumed a 10–15% underestimation of the Danish data due to in 2019 was 68 years (30–99) and 57% were females. The pro- lower completion of the DKR in the first 10 years of the study portion of unicompartmental knee arthroplasties (UKAs) has period. At the end of their study period, Danish and Finnish increased within the past 2 decades and currently constitutes incidences seemed more stable in nature than the Swedish and 22% of primary knee replacements in Denmark (Table 1). Norwegian incidences, which still showed signs of linear to The logistic and Gompertz regression analysis (Figure 2) exponential growth. This is the start of the bend in the Danish had an RMSE of 13.9 and 15.5 respectively, which represent incidence curve observed from 2010 through 2017. This had

Acta Orthopaedica 2021; 92 (4): 448–451

not been observed in other countries at the time of forecasting, which might be the reason why this study found that the projected incidence of primary KR in Denmark will reach a plateau sooner than the previously reported forecasts from comparable countries. Naturally, this bend in the Danish incidence curve from 2010 through 2017 could constitute a halt in growth, which would be replaced by a continuous future rise. However, the duration of the stable incidence counters that it should be outlier-years. The soon-to-be reached Danish maximal incidence might in part be due to the Danish tax-paid healthcare system, in which patients’ personal economy is not a limiting factor for healthcare accessibility. In practice, this means that every person assessed to require a KR will be offered one. Additionally, the required sick leave for the surgery and rehabilitation are paid for by the Danish Healthcare System, as are many other additional needs. The high accessibility and equality in Danish healthcare also ensures the high external validity of our study (Schmidt et al. 2019). The annual number of Danes needing a KR is comparable to the annual number of Danes receiving a KR. This results in higher incidences (the Danish KR incidence was the 10th highest among the OECD countries in 2017 (Organization for Economic Cooperation and Development 2019) compared with countries thay have more privatized healthcare or less equal healthcare, since some individuals will not be able to afford the surgery or unpaid sick leave. This leaves a group of unknown size without representation in future projections. Another reason for higher incidences in Denmark might also be a result of patient preferences, since hindrances to surgery are few in Denmark. For example, the waiting time for KR in Denmark was the lowest among the OECD countries in 2017 (Organization for Economic Co-operation and Development 2019). This in essence means that the incidence of KR in Denmark might be closer to the actual need for KRs compared with countries with privatized healthcare. Our study has some limitations. 1st, projections are sensitive to changes in the population, treatment protocols, or trends in treatment. Trends such as surgeons relying on partial knee replacements, e.g., medial unicompartmental arthroplasty to a higher degree than earlier (see Table 1) or changes in treatment protocol towards more or less conservative treatment, are also liable to affect future incidences. 2nd, several implants were excluded due to duplication, a lack of description, etc. before an adjustment with data from the NDPR was made, in order to create realistic estimates of the annual burden of KR. This might have resulted in less precise regressions; however, the difference would be minimal. 3rd, we were unable to retrieve projections of the evolution of known risk factors for OA such as BMI. Countries with either a higher mean BMI or a more rapid increase in their mean BMI might experience a more rapid increase in the need for KRs or higher maximum incidence than we found.

451

In conclusion, the incidence in Denmark of primary knee replacements seems to be nearing its plateau, but in spite of this the absolute number of primary KRs will continue to increase as the population gets older. The Danish healthcare system ought to prepare for an increase in primary knee replacements as well as revisions in the future. All authors were involved in planning the study. LD and AEG retrieved and analyzed the data. LD and TJ wrote the initial manuscript, and all authors accepted the final manuscript before submission. Acta thanks Szilard Nemes and Marieke Ostendorf for help with peer review of this study.

Bini S A, Sidney S, Sorel M. Slowing demand for total joint arthroplasty in a population of 3.2 million. J Arthroplasty 2011; 26(Suppl. 6): 124-8. Culliford D, Maskell J, Judge A, Cooper C, Prieto-Alhambra D, Arden N K. Future projections of total hip and knee arthroplasty in the UK: results from the UK Clinical Practice Research Datalink. Osteoarthritis Cartilage 2015; 23(4): 594-600. Guerrero-Ludueña R E, Comas M, Espallargues M, Coll M, Pons M, Sabatés S, Allepuz A, Castells X. Predicting the burden of revision knee arthroplasty: simulation of a 20-year horizon. Value Heal 2016; 19(5): 680-7. Inacio M C S, Graves S E, Pratt N L, Roughead E E, Nemes S. Increase in total joint arthroplasty projected from 2014 to 2046 in Australia: a conservative local model with international implications. Clin Orthop Relat Res 2017a; 475(8): 2130-7. Inacio M C S, Paxton E W, Graves S E, Namba R S, Nemes S. Projected increase in total knee arthroplasty in the United States: an alternative projection model. Osteoarthritis Cartilage 2017b; 25(11): 1797-803. Kurtz S, Ong K, Lau E, Mowat F, Halpern M. Projections of primary and revision hip and knee arthroplasty in the United States from 2005 to 2030. J Bone Joint Surg Ser A 2007; 89(4): 780-5. Nemes S, Rolfson O, W-Dahl A, Garellick G, Sundberg M, Kärrholm J, Robertsson O. Historical view and future demand for knee arthroplasty in Sweden. Acta Orthop 2015; 86(4): 426-31. Niemeläinen M J, Mäkelä K T, Robertsson O, W-Dahl A, Furnes O, Fenstad A M, Pedersen A B, Schrøder H M, Huhtala H, Eskelinen A. Different incidences of knee arthroplasty in the Nordic countries: a population-based study from the Nordic Arthroplasty Register Association. Acta Orthop 2017; 88(2): 173-8. Organization for Economic Co-operation and Development. Health at a Glance 2019. Paris: OECD; 2019. Patel A, Pavlou G, Mújica-Mota R E, Toms A D. The epidemiology of revision total knee and hip arthroplasty in England and Wales. Bone Joint J 2015; 97-B(8): 1076-81. Pedersen A B, Mehnert F, Odgaard A S H. Existing data sources for clinical epidemiology: the Danish Knee Arthroplasty Register. Clin Epidemiol 2012; 4:303-13. Rupp M, Lau E, Kurtz S M, Alt V. Projections of primary TKA and THA in Germany from 2016 through 2040. Clin Orthop Relat Res 2020; 478(7): 1622-33. Schmidt M, Schmidt S A J, Sandegaard J L, Ehrenstein V, Pedersen L, Sørensen H T. The Danish National Patient Registry: a review of content, data quality, and research potential. Clin Epidemiol 2015; 7: 449-90. Schmidt M, Schmidt S A J, Adelborg K, Sundbøll J, Laugesen K, Ehrenstein V, Sørensen H T. The Danish health care system and epidemiological research: from health care contacts to database records. Clin Epidemiol 2019: 563-91. Statistics D. Statistics Denmark. Available at: https://www.dst.dk/en

452

Acta Orthopaedica 2021; 92 (4): 452–454

Technical note 1-stage total knee arthroplasty and proximal tibial non-union correction using 3-D planning and custom-made cutting guide Andreas KAPPEL 1,2, Poul Torben NIELSEN 1, and Søren KOLD 1,2 1 Interdisciplinary Orthopaedics, Aalborg University Hospital; 2 Department of Clinical Medicine, Aalborg University, Denmark Correspondence: andreas.kappel@rn.dk Submitted 2021-01-30. Accepted 2021-02-15.

Primary total knee arthroplasty (TKA) in the presence of extra-articular deformity and non-union is challenging (Papagelopoulos et al. 2007, Sculco et al. 2019). Bony correction can be staged prior to TKA surgery or simultaneously as a 1-stage procedure. 1-stage procedures are well described in cases with tibial deformity (Wang and Wang 2002, Xiao-Gang et al. 2012, Catonné et al. 2019) and have also been described for proximal tibial non-union (Moskal and Mann 2001, Papagelopoulos et al. 2007). We present a new technique in which 3-dimensional (3D) planning and a custom-made cutting guide were used in a 1-stage procedure treating a complex case with posttraumatic knee osteoarthritis, proximal non-union, and deformity.

Surgical planning and technique The planning was done in cooperation with engineers from Materialise (Materialise, Leuven, Belgium). CT DICOM files were uploaded to the planning software in which resection of the non-union could be adjusted to establish optimal healthy bone contact following wedge closure. Deformity correction was planned to meet the anatomy of a statistically average model of the tibia. The program allowed visualization of resections and of the intended postoperative results with the tibial implant inserted (Figures 2 and 3). A personalized cutting guide was designed to guide the resection; fit of this guide was considered essential and as minor hardware was to be removed during surgery, screwholes from this hardware served as reference for guide placement (see Figure 3). Surgery was performed through a standard midline incision, which was extended distally and medially to join a previous incision, a medial parapatellar arthrotomy was used, and the

Patient A 67-year-old and otherwise healthy male (BMI = 29), with sequala from a complex proximal left tibial fracture (AO type 41-C3), primarily treated, 2 years previously, with a combination of internal and external fixation. Complaints were knee pain, valgus deformity, and extension deficit. ROM was from 15° extension deficit to 120° of flexion. Preoperative radiographs and CT scan showed severe lateral joint incongruency, and proximal tibial non-union, valgus deformity, and a failed osteosynthesis with a broken plate (Figure 1). Treatment options included staged surgery with deformity correction and treatment of the non-union using external fixation as the first stage. Due to the joint incongruency and patient unwillingness for external fixation, we chose bony correction with closed wedge resection of non-union site Figure 1. Preoperative CT and radiographs showing posttraumatic lateral tibial joint incongruence and and TKA in a 1-stage approach. non-union. © 2021 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group, on behalf of the Nordic Orthopedic Federation. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. DOI 10.1080/17453674.2021.1894789

Acta Orthopaedica 2021; 92 (4): 452–454

Figure 2. Surgical planning: red areas are planned resections, including resection of non-union site. Left: coronal view. Mid: sagittal view illustrating biplanar cuts. Right: coronal view with personalized cutting guide (Materialise planning software).

453

medial tibia was stripped from soft tissue to allow guide placement. Existing hardware was removed, and the cutting guide temporarily fixed to the bone with K-wires. Resection was performed with a reciprocating saw under fluoroscopy. Prior to closing the wedge, the resected non-union site was grafted with autologous bone graft from femoral bone cuts. TKA insertion was uneventful, the tibial component was cemented on the proximal surface and uncemented distally, and the medullary canal was reamed to obtain stability of the stem. The postoperative protocol was negative pressure wound therapy (2 weeks), mild compression bandage (6 weeks), and immediate partial weightbearing (6 weeks). Radiographs at day 1, 6 weeks, 4 months, and 1 year showed correction of the deformity with satisfying placement of the implant and bony healing (Figure 4). At 1-year follow-up there was an extension deficit of 5°; flexion was 115° Goals of pain reduction and functional improvement were met. Ethics, funding, and potential conflict of interests Informed consent to publish the case was obtained from the patient. No external funding was obtained. The authors have no conflicts of interest to declare.

Discussion Surgical planning from a PACS viewer with inspection of 2D radiographs and 3D reconstructions was challenging in this complex case and we found the 3D planning software helpFigure 3. Surgical planning. Left: comparison of deformity to statistically average tibia. Right: planned correction and implant position (Materiful in this process. The use of a custom guide during surgery alise planning software). required extended soft tissue stripping from the medial tibia; however, the guide facilitated uncomplicated resection. If resection were to have been guided by K-wires inserted under fluoroscopy we believe that only a simpler resection would have been feasible; this would have compromised the bony contact area and healing following wedge closure. Bone union following nonunion correction is dependent on stable fixation, mechanical alignment, and early functional rehabilitation (Ferreira and Marais 2015, Windolf et al. 2020). In the technique described the tibial implant is fixed to the proximal segment with surface cementing and to the distal segment with an uncemented splined stem. StabilFigure 4. Postoperatively, planned correction and implant position obtained. Left: day 1. Right 12 months. ity of the construct is enhanced by bony stability obtained from the

454

exact biplanar resection, and there is no need for other fixation of the osteotomy with plates or screws as described by previous authors (Moskal and Mann 2001). The advantages of using these new techniques in complex cases are supported by previous reviews (Tack et al. 2016, Ejnisman et al. 2021).

Preoperative planning of the surgery: all authors. Responsible surgeon: AK. Primary draft of manuscript: AK. Critical revision of work and final approval of manuscript: all authors. Acta thanks Kaj Knutson for help with peer review of this study.

Catonné Y, Sariali E, Khiami F, Rouvillain J L, Wajsfisz A, Pascal-Moussellard H. Same-stage total knee arthroplasty and osteotomy for osteoarthritis with extra-articular deformity. Part I: Tibial osteotomy, prospective study of 26 cases. Orthop Traumatol Surg Res 2019; 105(6): 1047-54. Ejnisman L, Gobbato B, Camargo A F de F, Zancul E. Three-dimensional printing in orthopedics: from the basics to surgical applications. Curr Rev

Acta Orthopaedica 2021; 92 (4): 452–454

Musculoskelet Med 2021 Jan 7. Online ahead of print. doi 101007/s12178020-09691-3. Ferreira N, Marais L C. Management of tibial non-unions according to a novel treatment algorithm. Injury 2015; 46(12): 2422-7. Moskal J T, Mann J W. Simultaneous management of ipsilateral gonarthritis and ununited tibial stress fracture: combined total knee arthroplasty and internal fixation. J Arthroplasty 2001; 16(4): 506-11. Papagelopoulos P, Karachalios T, Themistocleous G, Papadopoulos E, Savvidou O, Rand J. Total knee arthroplasty in patients with pre-existing fracture deformity. Orthopedics 2007; 30(5): 373-8. Sculco P K, Kahlenberg C A, Fragomen A T, Rozbruch S R. Management of extra-articular deformity in the setting of total knee arthroplasty. J Am Acad Orthop Surg 2019; 27(18): e819-30. Tack P, Victor J, Gemmel P, Annemans L. 3D-printing techniques in a medical setting: a systematic literature review. Biomed Eng Online 2016; 15(1): 1–21. Wang J-W, Wang C-J. Total knee arthroplasty for arthritis of the knee with extra-articular deformity. J Bone Joint Surg Am 2002; 84(10): 1769-74. Windolf M, Ernst M, Schwyn R, Arens D, Zeiter S. The relation between fracture activity and bone healing with special reference to the early healing phase: a preclinical study. Injury 2020; 52: 71-7. Xiao-Gang Z, Shahzad K, Li C. One-stage total knee arthroplasty for patients with osteoarthritis of the knee and extra-articular deformity. Int Orthop 2012; 36(12): 2457-63.

Acta Orthopaedica 2021; 92 (4): 455–460

455

Antiresorptive treatment and talar collapse after displaced fractures of the talar neck: a long-term follow-up of 19 patients Andreas MEUNIER 1, Lars PALM 1, Per ASPENBERG 1*, and Jörg SCHILCHER 1,2 1 Department

of Orthopedics and Department of Biomedical and Clinical Sciences, Faculty of Health Science, Linköping University, Linköping; 2 Wallenberg Centre for Molecular Medicine, Linköping University, Linköping, Sweden *Author deceased Correspondence: andreas.meunier@regionostergotland.se Submitted 2020-11-02. Accepted 2021-03-08.

Background and purpose — Displaced fractures of the talar neck are associated with a high risk of structural collapse. In this observational analysis we hypothesized that pharmacological inhibition of osteoclast function might reduce the risk of structural collapse through a reduction in bone resorption during revascularization of the injured bone. Patients and methods — Between 2002 and 2014 we treated 19 patients with displaced fractures of the talar neck with open reduction and internal fixation. Of these, 16 patients were available for final follow-up between January and November 2017 (median 12 years, IQR 7–13). Among these, 6 patients with Hawkins type 3 fractures and 2 patients with Hawkins type 2b fractures received postoperative antiresorptive treatment (7 alendronate, 1 denosumab) for 6 to 12 months. The remaining 8 patients received no antiresorptive treatment. The self-reported foot and ankle score (SEFAS) was available in all patients and 15 patients had undergone computed tomography (CT) at final follow-up, which allowed evaluation of structural collapse of the talar dome and signs of post-traumatic osteoarthritis. Results — The risk for partial collapse of the talar dome was equal in the 2 groups (3 in each group) and post-traumatic arthritis was observed in all patients. The SEFAS in patients with antiresorptive treatment was lower, at 21 points (95% CI 15–26), compared with those without treatment, 29 points (CI 22–35). Interpretation — Following a displaced fracture of the talar neck, we found no effect of antiresorptive therapy on the rate of talar collapse, post-traumatic osteoarthritis, and patient-reported outcomes.

Talar neck fractures are often the result of a high-energy trauma and are associated with a high risk of complications (Dodd and Lefaivre 2015). Given the intraarticular location of the talus and its extensive cartilage coverage, its blood supply is vulnerable. Traumatic disruption of the blood supply during fracture of the talar neck leaves the talar dome at a high risk of avascular necrosis (AVN). The risk of AVN seems to be dependent on the presence of subtalar joint dislocation. If the joint is dislocated (type 2b and more severe types) the risk of AVN is 25–50%. No AVN is seen in cases of minor joint displacement (type 2a) (Vallier et al. 2014). Interruption of the blood supply to the talar dome impairs the fracture healing capacity and bone remodeling. Subsequently, with revascularization of avascular areas, damaged osteocytes trigger the formation of osteoclasts (Glimcher and Kenzora 1979). These osteoclasts resorb the dead bone, and osteoblasts deposit new bone material (Hofstaetter et al. 2006). If the rate of bone resorption exceeds the rate of new bone formation the mechanical properties of the talar dome will deteriorate and may not be sufficient to withstand the forces transmitted during weight-bearing. The resulting collapse of the talus is associated with severe loss of function, limited range of motion and pain (Annappa et al. 2015). Bisphosphonates reduce osteoclast function and thereby inhibit bone resorption. These agents have been proven to be successful in the treatment of several bone metabolic disorders, such as osteoporosis, osteogenesis imperfecta, Paget’s disease of bone, and metastatic bone disease. In addition, in the treatment of AVN of the femoral head, bisphosphonate treatment is associated with a reduction in the rate of structural collapse (Luo et al. 2014). Considering the detrimental effects of post-traumatic AVN of the talus, and the low-risk profile of short-term antiresorp-

456

Acta Orthopaedica 2021; 92 (4): 455–460

Surgically treated fractures of the talus 2002–2014 n = 40 Excluded (n = 21): – Hawkins type 1, 2 – Hawkins type 2a, 7 – no neck fracture, 12 Displaced fractures of the talar neck (n = 19): – Hawkins type 2b, 8 – Hawkins type 3, 11 No treatment n = 10 Lost to follow-up n=2 Follow-up (n = 8): – Hawkins type 2b, 5 – Hawkins type 3, 3 SEFAS 29 point Collapse, 3 Arthritis, 7 Arthrodesis, 2

Treatment n=9 Lost to follow-up n=1 Follow-up (n = 8): – Hawkins type 2b, 2 – Hawkins type 3, 6 SEFAS 21 point Collapse, 3 Arthritis, 8 Arthrodesis, 1

Figure 1. Flowchart of patient selection, treatment, and follow-up.

Figure 2. 3-D CT reconstruction of fracture types represented in the study population: Hawkins type 2b (A) and type 3 (B).

tive treatment after a fracture (Abrahamsen 2010, Li et al. 2015), our department has encouraged postoperative treatment with antiresorptives after displaced fractures of the talar neck. In this retrospective analysis we evaluated the effects of this recommended treatment on the risk of talar collapse, the development of post-traumatic osteoarthritis (OA), and patient-reported outcome.

with fractures of: Hawkins type 1 (n = 2); type 2a (displaced, but not dislocated subtalar joint) (n = 7) type 2b (dislocation of the subtalar joint but no tibio-talar dislocation; Figure 2a) (n = 8); and type 3 (both dislocation of the subtalar and the tibio-talar joint; Figure 2b) (n = 11). No type 4 fracture was identified. The numbers of patients with associated fractures of the talar dome were: 3 in the treatment group; and 2 in the group without antiresorptive treatment. The 19 patients with type 2b (n = 8) and type 3 (n = 11) fractures were called for final follow-up between January and November 2017. Of the 19 patients who were called for follow-up, 16 patients (10 men) completed follow-up at a median of 12 years (IQR 7–13). Of these, 1 patient abstained from radiographic followup. The median age of the 16 patients in the study was 36 years (IQR 25–49). The patients’ background characteristics are presented in Tables 1 and 2.

Patients and methods Patients This is an observational analysis of patients who were treated surgically for fractures of the talus at Linköping University Hospital in the period 2002–2014 (Figure 1). During this period, we treated 40 consecutive patients with fracture of the talus with internal fixation. Individual radiographs taken preand postoperatively were collected and archived in our institutional Picture Archiving and Communication System. Fracture classification Initial CT scans were used to classify fractures according to the modified Hawkins Classification (Vallier et al. 2014) (AM). Of the 40 patients in the register, 12 patients with fracture of the lateral process of the talus or without fracture of the talar neck were excluded. The remaining 28 patients were independently classified by 2 reviewers (AM and LP) who were blinded to treatment with antiresorptive drugs. The 2 reviewers showed excellent agreement for fracture types 1–3 in 26 cases. The remaining 2 patients were classified after consensus agreement. The final classification yielded patients

Treatment Surgical fixation was performed with either 1 or 2 screws inserted posterior (n = 12) or anterior (n = 7). Plate osteosynthesis was not used. Postoperatively, the patients were immobilized with a below-knee cast without weight-bearing for 6 to 8 weeks. Based on the preference of the responsible surgical team, a treatment protocol with oral alendronate at 70 mg weekly was initiated within 14 days of the surgery and continued for 6 to 12 months (Table 1). At the end of the study period, the treatment protocol was switched to a single dose of 60 mg denosumab administered intravenously. Of the 19 patients with type 2b and type 3 fractures, 8 patients received alendronate and 1 received denosumab. The remaining 10 patients did not

Acta Orthopaedica 2021; 92 (4): 455–460

457

Table 1. Patients’ characteristics and patient-reported outcomes using SEFAS and radiographic evaluation (CT) at final follow-up Trauma Age Sex Type a Associated injuries same foot mechanism 31 Male 3 26 Male 3 46 Female 3 61 Male 3 45 Male 3 38 Female 2b 49 Female 2b 22 Male 3 34 Male 3 55 Female 3 66 Male 3 19 Male 2b 60 Male 2b 25 Female 2b 15 Female 2b 21 Male 2b 37 Male 2b 28 Male 3 22 Male 3

Ankle fracture Talus body Talus body Ankle fracture Ankle fracture Open talus body fracture Talus body, cuneiform, cuboid, navicular, and ankle fracture None Ankle fracture Ankle fracture None Ankle fracture None Talar body fracture None Talus body Open distal tibia, calcaneus, and navicular fracture None None Ankle fracture

a Based on the modified b A = alendronate

Antiresorptive Follow-up treatment b (years) Radiological findings

SEFAS score

MVA Fall > 2 m Fall > 2 m Crush injury Fall c MVA

A, 6 months A, 6 months A, 6 months A, 6 months A, 12 months A, 6 months

12.7 12.7 2.4 5.8 7.3 11.9

Osteophytes Partial collapse, osteophytes Partial collapse, osteophytes, JSN Collapse with sequestration JSN JSN, arthrodesis tibia-talus JSN

26 14 24 12 32 11

MVA MVA MVA Low energy Fall > 2 m MVA Low energy Fall > 2 m MVA MVA

A, 12 months Denosumab No No No No No No No No

4.5 2.3 11.3 12.4 14.4 11.4 11.8 12.7 13.4 9.6

JSN Osteophytes Osteophytes JSN, arthrodesis talus-calcaneus Collapse, arthrodesis, tibia-talus-calcaneus Osteophytes Partial collapse, osteophytes Osteophytes Noradiographs available Collapse, osteophytes

17 31 24 17 36 38 24 30 43 16

Low energy MVA MVA

No No A

No follow-up No follow-up No follow-up

Hawkins classification.

MVA = motor vehicle accident; JSN = joint space narrowing.

Table 2. Outcomes for the cohort as a whole and for the two groups. Factor n Collapse Overall Treatment Type 2b Type 3 No treatment Type 2b Type 3

16 8 2 6 8 5 3

6 3 0 3 3 2 1

SEFAS mean (SD)

Mean age

Male sex

25 (9.7) 21 (8.5) 14 (4.2) 23 (8.4) 29 (9.9) 30 (10.8) 26 (9.6)

38 40 44 39 37 28 52

10 5 0 5 5 3 2

receive any antiresorptive treatment with either alendronate or denosumab (see Figure 1). The prescription of antiresorptive drugs and the duration of treatment were obtained through a review of the medical charts and registered individual drug prescriptions. These pieces of information were obtained from the archived medical charts of patients treated between 2002 and 2008, and from the digitized medical charts and prescription system from 2008 and thereafter. For evaluation of the outcome at the final follow-up (16 available patients) the patients were divided into 2 groups. The first group comprised those who received postoperative antiresorptive treatment (n = 8) and the second group contained those who did not receive such treatment (n = 8) (see Figure 1).

Outcome measures The follow-up CT images were compared with the postoperative CT images in all planes to determine the degree of collapse (partial or total flattening or sequestration; Figures 3–5) of the talar dome. Osteoarthritis (OA) was defined as the occurrence of osteophytes on the joint surfaces or joint space narrowing of the talus or tibia (tibiotalar OA) or talus and calcaneus/navicular bone (subtalar OA). The SEFAS system (Cöster et al. 2012) was used to evaluate patient-reported outcomes at the final follow-up. SEFAS is the validated Swedish modified translation of the New Zealand ankle questionnaire (Hosman et al. 2007). Statistics We attempted to calculate the power of our sample to reject the null hypothesis. Due to the heterogeneity of the results obtained in previous studies and the small size of our cohort, we eventually refrained from a post-hoc power calculation. The SEFAS score is reported as the mean and standard deviation (SD) in accordance with previous publications (Cöster et al. 2017). Student’s t-test and relative risk (RR) with 95% confidence intervals (CI) were calculated to test differences in the outcome variables between groups. Ethics, funding, and potential conflicts of interest The study was approved by the local Ethical Review Board (Dnr 2016/317-31), and all patients gave informed consent.

458

Figure 3. CT of a Hawkins type 3 fracture (A) in a 26-year-old man after a fall from a ladder. The fracture was treated with screw fixation (B), a non-weight-bearing cast for 8 weeks, and alendronate for 6 months. After 12.5 years, the fracture appeared to be healed on CT, showing post-traumatic talocrural OA and partial collapse of the talar dome (C).

This study was supported by ALF-grants from Region Östergötland, Sweden. We thank the Knut and Alice Wallenberg Foundation for generous support. None of the authors has any conflict of interest.

Results In the radiographic follow-up, 6 (3 in each group) out of 15 patients (1 patient without treatment and a type 2b fracture had no radiographic follow-up) showed at least partial collapse of the talus, RR 0.9 (CI 0.4–2.0). For the 16 patients with available SEFAS scores at final follow-up, the mean score was 25 points (CI 20–30). For patients who received antiresorptive treatment, the mean SEFAS was 21 (CI 15–26), as compared with 29 points (CI 22–35) in the group without antiresorptive treatment. In all patients, post-traumatic OA was observed in at least 1 of the 3 possible joints. Arthrodesis was performed on 1 patient in the treatment group and on 2 patients in the untreated group (Figure 1 and Table 1).

Acta Orthopaedica 2021; 92 (4): 455–460

Figure 4. CT demonstrating a Hawkins type 3 fracture of the talus and associated medial malleolus fracture (A) in a 61-year-old man after a crush injury suffered during forest work. The fracture was treated with screw fixation (B), non-weight-bearing cast immobilization, and alendronate for 6 months. At 5.8 years after the surgery, the fracture had healed and the subchondral bone in the talar dome showed fragmentation due to collapse (C). Despite having poor ankle function (SEFAS of 12), the patient declined further surgery.

Discussion In this observational analysis, we tested the hypothesis that pharmacologic inhibition of osteoclast function might reduce the risk of structural collapse after long-term follow-up. We found similar radiologic and patient-reported outcome in the 2 groups. 3 patients in each group showed at least partial collapse of the talar dome and all patients showed post-traumatic OA in at least 1 of the 3 examined joints. The rate of talar collapse in our study (6/16) is roughly twice that seen in a recent retrospective study of patients with fracture types 2b and 3 (8/43) (Vallier et al. 2014). This large difference may be related to multiple factors. The follow-up period in our study was much longer (median of 12 years compared with a mean of 2.5 years), and the sensitivity to detect areas of structural collapse using CT scans in our study is likely to be higher than that of the plain

Acta Orthopaedica 2021; 92 (4): 455–460

Figure 5. CT reconstruction demonstrating a Hawkins type 2b fracture with comminution of the talar dome (A) in a 60-year-old man after a low-energy fall. The fracture was reduced and fixed with 2 screws (B), followed by 6 weeks in a non-weight-bearing cast. No antiresorptive treatment was given. At 11.8 years postoperatively, the fracture was healed and the talar dome showed partial collapse (C). (SEFAS of 24).

radiographs used in the Vallier study. Most previous studies did not evaluate collapse but looked at AVN only as defined by Hawkins, which makes comparisons with our study difficult. Moreover, the methods used for evaluating AVN have ranged from plain radiographs to MRI, and the definition of collapse differs across studies and among examiners (Dodd and Lefaivre 2015). In contrast to our hypothesis, the antiresorptive treatment administered in our study does not seem to affect the overall rate of structural collapse. We found a 100% risk of post-traumatic OA in at least 1 of the examined joints (tibiotalar joint, subtalar joint, talonavicular joint). In this respect, there was no statistically significant difference between the 2 groups. This overall rate of OA is higher compared with those reported previously: 36/66 in the Vallier study (Vallier et al. 2014) and 68% in a systematic review (Halvorson et al. 2013). However, it is similar to the OA rate reported in a recent meta-analysis, in which 81% of patients showed signs of subtalar joint OA after more than 2 years of follow-up. In that study, talocrural OA was not included, and the authors noted that the majority of the

459

included studies used only plain radiographs for the evaluation of OA. The high rate of post-traumatic OA we observed may be attributable to the long-term follow-up and thorough evaluation using CT. Considering the high number of patients who suffered talar collapse and OA, the self-reported poor outcomes of the patients on the SEFAS are not surprising. We did not investigate the development of AVN in the early postoperative phase, and we administered treatment irrespective of whether changes indicative of AVN were present or absent. The rationale here was to administer treatment before any bone resorption could occur. However, given the situation of interrupted perfusion after trauma, it might be more reasonable to initiate treatment once signs of revascularization become visible (Young et al. 2012). At this stage, the drug could reach the damaged area when bone resorption peaks and, thus, could inhibit osteoclast activity more effectively. Such an approach would be similar to the situation of atraumatic AVN of the femoral head, for which several clinical follow-up studies have shown a protective effect on structural collapse of the femoral head (Lai et al. 2005, Nishii et al. 2006, Agarwala et al. 2009, Agarwala and Vijayvargiya 2019). However, randomized trials using bisphosphonates to prevent femoral head collapse did not find a decrease in the need for arthroplasty (Chen et al. 2012, Lee et al. 2015). The main strength of our study is its focus on the most severe types of talar neck fractures that have a high risk of structural collapse. In addition, the long-term follow-up, the high rate of patient participation in follow-up, and the rigorous evaluation of post-traumatic changes using CT ensure that our results are highly relevant for the clinical setting. Our radiographic analysis is based on comparisons of postoperative and follow-up CT scans with a higher sensitivity to detect collapse and OA, as compared with plain radiographs. There are several limitations of our study. Treatment was not randomized but only recommended. Only half of the patients who could have received treatment were treated. Therefore, patients who received treatment may have been selected by surgeons with a higher level of experience and awareness of complications related to these injuries, which could have led to assignment of the worst cases to the treatment group. The treatment dosage and duration were empirical, based on experience gained from previous research. Patient compliance with the treatment regimen was questioned but was not registered in a logbook. Furthermore, our results might be biased by confounding elements. Because of the small sample size, we were not able to correct for potential confounders in this heterogeneous patient population. In conclusion, for patients with displaced fractures of the talar neck, we found no effect of antiresorptive therapy on the rate of talar collapse, post-traumatic osteoarthritis, and patient-reported outcomes. However, because of the retrospective nature of the study and wide confidence intervals of our risk estimates, these results do not provide final evidence.

460

AM: Study design, review of radiographs and data analysis, writing and revision of the manuscript. LP: Study design, review of radiographs and data analysis, review of the manuscript. JS: Data analysis, writing and revision of the manuscript. PA: Original study idea. Abrahamsen B. Adverse effects of bisphosphonates. Calcif Tissue Int 2010; 86(6): 421-35. doi: 10.1007/s00223-010-9364-1. Agarwala S, Vijayvargiya M. Bisphosphonate combination therapy for non-femoral avascular necrosis. J Orthop Surg Res 2019; 14(1): 112. doi: 10.1186/s13018-019-1152-7. Agarwala S, Shah S, Joshi V R. The use of alendronate in the treatment of avascular necrosis of the femoral head. J Bone Joint Surg Br 2009; 91-B(8): 1013-8. doi: 10.1302/0301-620X.91B8.21518. Annappa R, Jhamaria N L, Dinesh K V, Devkant, Ramesh R H, Suresh P K. Functional and radiological outcomes of operative management of displaced talar neck fractures. Foot (Edinb) 2015; 25(3): 127-30. doi: 10.1016/j.foot.2015.03.004. Chen C H, Chang J K, Lai K A, Hou S M, Chang C H, Wang G J. Alendronate in the prevention of collapse of the femoral head in nontraumatic osteonecrosis: a two-year multicenter, prospective, randomized, doubleblind, placebo-controlled study. Arthritis Rheum 2012; 64(5): 1572-8. doi: 10.1002/art.33498. Cöster M, Karlsson M K, Nilsson J A, Carlsson A. Validity, reliability, and responsiveness of a self-reported foot and ankle score (SEFAS). Acta Orthop 2012; 83(2): 197-203. doi: 10.3109/17453674.2012.657579. Cöster M C, Nilsdotter A, Brudin L, Bremander A. Minimally important change, measurement error, and responsiveness for the SelfReported Foot and Ankle Score. Acta Orthop 2017; 88(3): 300-4. doi: 10.1080/17453674.2017.1293445. Dodd A, Lefaivre K A. Outcomes of talar neck fractures: a systematic review and meta-analysis. J Orthop Trauma 2015; 29(5): 210-5. doi: 10.1097/ BOT.0000000000000297. Glimcher M J, Kenzora J E. The biology of osteonecrosis of the human femoral head and its clinical implications, II: The pathological changes in the femoral head as an organ and in the hip joint. Clin Orthop Relat Res 1979; (139): 283-312.

Acta Orthopaedica 2021; 92 (4): 455–460

Halvorson J J, Winter S B, Teasdall R D, Scott A T. Talar neck fractures: a systematic review of the literature. J Foot Ankle Surg 2013; 52(1): 56-61. doi: 10.1053/j.jfas.2012.10.008. Hofstaetter J G, Wang J, Yan J, Glimcher M J. Changes in bone microarchitecture and bone mineral density following experimental osteonecrosis of the hip in rabbits. Cells Tissues Organs 2006; 184(3-4): 138-47. doi: 10.1159/000099620. Hosman A H, Mason R B, Hobbs T, Rothwell A G. A New Zealand national joint registry review of 202 total ankle replacements followed for up to 6 years. Acta Orthop 2007; 78(5): 584-91. doi: 10.1080/17453670710014266. Lai K-A, Shen W-J, Yang C-Y, Shao C-J, Hsu J-T, Lin R-M. The use of alendronate to prevent early collapse of the femoral head in patients with nontraumatic osteonecrosis: a randomized clinical study. J Bone Joint Surg 2005; 87(10): 2155-9. doi: 10.2106/jbjs.D.02959. Lee Y K, Ha Y C, Cho Y J, Suh K T, Kim S Y, Won Y Y, Min B W, Yoon T R, Kim H J, Koo K H. Does zoledronate prevent femoral head collapse from osteonecrosis? A prospective, randomized, open-label, multicenter study. J Bone Joint Surg Am 2015; 97(14): 1142-8. doi: 10.2106/JBJS.N.01157. Li Y T, Cai H F, Zhang Z L. Timing of the initiation of bisphosphonates after surgery for fracture healing: a systematic review and meta-analysis of randomized controlled trials. Osteoporos Int 2015; 26(2): 431-41. doi: 10.1007/s00198-014-2903-2. Luo R B, Lin T, Zhong H M, Yan S G, Wang J A. Evidence for using alendronate to treat adult avascular necrosis of the femoral head: a systematic review. Med Sci Monit 2014; 20: 2439-47. doi: 10.12659/ MSM.891123. Nishii T, Sugano N, Miki H, Hashimoto J, Yoshikawa H. Does alendronate prevent collapse in osteonecrosis of the femoral head? Clin Orthop Relat Res 2006; 443: 273-9. doi: 10.1097/01.blo.0000194078.32776.31. Vallier H A, Reichard S G, Boyd A J, Moore T A. A new look at the Hawkins classification for talar neck fractures: which features of injury and treatment are predictive of osteonecrosis? J Bone Joint Surg Am 2014; 96(3): 192-7. doi: 10.2106/JBJS.L.01680. Young M L, Little D G, Kim H K. Evidence for using bisphosphonate to treat Legg-Calvé-Perthes disease. Clin Orthop Relat Res 2012; 470(9): 2462-75. doi: 10.1007/s11999-011-2240-0.

Acta Orthopaedica 2021; 92 (4): 461–467

461

Determining the development stage of the ossification centers around the elbow may aid in deciding whether to use ESIN or not in adolescents’ forearm shaft fractures Markus STÖCKELL 1, Tytti POKKA 1, Nicolas LUTZ 2, and Juha-Jaakko SINIKUMPU 1 1 Department

of Children and Adolescents, Oulu University Hospital; PEDEGO Research Group, Oulu Childhood Fracture and Sports Injury Study, Oulu University and Oulu University Hospital; Medical Research Council, Oulu University, Oulu, Finland; 2 Department of Pediatric Surgery, Lausanne University Hospital, Lausanne, Switzerland Correspondence: markus.stockell@hotmail.fi Submitted 2020-07-03. Accepted 2021-03-22.

Background and purpose — Elastic stable intramedullary nailing (ESIN) is the preferred method of operative stabilization of unstable pediatric forearm shaft fractures. However, the decision whether to use ESIN or open reduction and internal fixation (ORIF) in older children or teenagers is not always straightforward. We hypothesized that the development stage of the elbow would aid in evaluating the eligibility of the patient for ESIN. Patients and methods — All eligible children, aged <16 years who were treated with ESIN in Oulu University Hospital, during 2010–2019 were included (N = 70). The development stages of 4 ossification centers were assessed according to the Sauvegrain and Diméglio scoring. The proportion of impaired union vs. union was analyzed according to bone maturity, by using the optimal cutoff-points determined with receiver operating characteristics (ROC). Results — Development stage ≥ 6 in the olecranon was associated with impaired union in 20% of patients, compared with none in stages 1–5 (95% CI of difference 8% to 24%). Trochlear ossification center ≥ 4 was associated with impaired union in 17% of patients (CI of difference 7% to 36%) and lateral condyle ≥ 6 in 13% of patients (CI of difference 3.4% to 30%). Proximal radial head ≥ 5.5 was associated with impaired union in 18% of patients (CI of difference 7% to 39%). Interpretation — Recognizing the rectangular or fused olecranon ossification center, referring to stage ≥ 6, was in particular associated with impaired fracture healing. This finding may aid clinicians to consider between ESIN and plating, when treating forearm shaft fracture of an older child or teenager.

Pediatric forearm shaft fractures comprise 6% of all childhood fractures. They occur most frequently in children aged 5–14 years (Wall 2016, Joeris et al. 2017, Alrashedan et al. 2018). Most can be treated nonoperatively, and this is particularly recommended in children < 9 years (Zionts et al. 2005, Franklin et al. 2012). Older children are more prone to complications such as nonunion and redisplacement (Asadollahi et al. 2017). Their longer fracture healing time and less pronounced remodeling capacity have resulted in a trend toward operative management recently (Sinikumpu et al. 2012). Elastic stable intramedullary nailing (ESIN) is the preferred method to fix forearm shaft fractures in children. The method spares periosteal blood supply and surgical wounds are usually far from the fracture. ESIN produces good angular and longitudinal stability (Wall 2016). In older children and teenagers open reduction and internal fixation (ORIF) is optional (Herman and Marshall 2006). Their fractures are more prone to complications and even minor displacement may result in shortening and angulation, thus decreasing pro- and supination, similarly to adult patients (Rehman and Sokunbi 2010). However, the calendar age of a patient does not always match the maturation of the skeleton, making it challenging to select between pediatric-like or adult-like treatment. Bone age of the patient would help the clinician when choosing between ESIN and plating in older children. Bone age could be assessed by additional radiographs of the hand or iliac spine. However, keeping in mind that there are several ossification centers in the elbow, which develop in a particular order in a growing child, we hypothesized that higher development stage of elbow ossification centers would be associated with impaired healing of forearm shaft fractures stabilized by

462

Acta Orthopaedica 2021; 92 (4): 461–467

Figure 2. The elbow has been captured on the lateral view of the forearm radiograph and the secondary ossification center of the olecranon has been marked with a border (dotted line). The patient is a male aged 12. According to the Sauvegrain and Diméglio method the olecranon ossification stage with a rectangular shape is 6. This stage was found to be a cutoff point in association with disturbed bone union after ESIN of forearm shaft fracture.

Figure 1. Modified illustration of the Sauvegrain and Diméglio classification of maturation of the secondary ossification centers around the elbow. This staging was used in this research to find the optimal cutoffpoint for (A) olecranon, (B) proximal radial head, (C) lateral condyle of the humerus, and (D) trochlea of the humerus.

ESIN. We aimed to find a method to predict impaired union of forearm shaft fractures treated by ESIN, by using the Sauvegrain classification system for bone age (Sauvegrain et al. 1962).

Patients and methods This is a population-based study that included all eligible consecutive patients, aged less than 16 years, who had been treated for a forearm shaft fracture using ESIN in the study center. The study center is the only full-time pediatric trauma unit in the geographical catchment area of Oulu region, in Northern Finland. The child population at risk in the study area is approximately 87,000 annually. All primary and follow-up radiographs were reviewed to confirm inclusion; only diaphyseal both-bone fractures treated by titanium alloy elastic stable intramedullary nails were included and AO classification was used to subgroup the patients (Slongo et al. 2007). If more than 1 forearm shaft fracture was found in the same patient, the 1st of them was included for analysis. There were 2 such patients. No patient had a pathological fracture or bone dysplasia. Altogether 112 patients were primarily reviewed and finally 70 of these met

the inclusion criteria. 36 patients were treated surgically by using Kirschner wires or other straight nails, intramedullary or other surgical procedures, and they were therefore excluded. 4 patients were excluded due to incorrect diagnosis and 2 more had an isolated fracture of the ulna. Altogether 4 impaired unions were detected; 1 patient with nonunion had undergone an ossifying operation. 3 patients showed delayed union but ossified finally, after 6, 8, and 11 months of follow-up. Insufficient or lacking callus, fracture line visibility, or lacking cortical healing were assessed according to the Lane–Sandhu score (Bhandari et al. 2002). The patients who suffered from nonunion were immobilized for a mean 5.3 weeks, compared with 4.0 weeks among patients with expected fracture union. In the nonunion group, all patients had a nail–intramedullary canal ratio of at least 0.4, while 4.9% of cases in the union group were stabilized by using thinner nails with a nail–intramedullary canal ratio of < 0.4. There was no difference in the preferred orientation of the tips of the nails in the radius and ulna in comparison, demonstrating correct direction of the pre-bent nails, while none in the nonunion group and 16 patients in the union group achieved this in textbook fashion. The patients were mean 9.8 years (2–15) of age. Half of them were boys (n = 36). 43 were classified as type 22-D/5.1, 14 were type 22-D/5.2, 11 were type 22-D/4.1, and 2 were type 22-D4.2. Trampoline jumping was the most frequent cause of injury (26): 23 from a fall < 1 m and 15 from more than 1 m. 12 children were treated surgically following failure of nonoperative treatment (loss of reduction). We examined the association between impaired union and the development status of the elbow ossification centers: olecranon, trochlea, lateral condyle, and proximal radius. The maturation stage, referring to bone age, was classified by using the

Acta Orthopaedica 2021; 92 (4): 461–467

Table 1. Sensitivity, specificity, and predictive values of the elbow ossification centers, according to Sauvegrain and Diméglio development stage with particular cutoff points, in determining impaired union of forearm shaft fractures treated with elastic stable intramedullary nailing (ESIN)

Sauvegrain stages 10 9 8

Impaired union: no yes cut-off value

7 6 5

Area under ROC curve Se (CI) a

Olecranon Trochlea Lateral humeral condyle Proximal radial head

4 3 2 1 0

463

Olecranon Trochlea

Lateral humeral condyle

Proximal radial head

0.84 1 (0.40–1) 0.85 1 (0.40–1) 0.84 1 (0.40–1) 0.87 0.71 (0.40–1)

Sp (CI) 0.75 (0.64–0.85) 0.70 (0.57–0.80) 0.61 (0.48–0.72) 0.73 (0.60–0.83)

PPV (CI) 0.20 (0.06–0.44) 0.17 (0.05–0.37) 0.13 (0.04–0.31) 0.18 (0.05–0.40)

NPV (CI) a 1 (0.93–1) 1 (0.92–1) 1 (0.91–1) 1 (0.93–1)

ROC = receiver operating characteristic, Se = sensitivity, Sp = specificity, PPV = positive predictive value, NPV = negative predictive value, CI = 95% confidence interval. a one-sided CI.

Figure 3. Optimal cutoff points (lines) of Sauvegrain and Diméglio stages, according to area under receiver operating curve (ROC). Green dots are the fractures that united and red dots are the fractures that showed impaired union.

Sauvegrain method, modified by Diméglio (Sauvegrain et al. 1962, Diméglio et al. 2005, Charles et al. 2007) (Figure 1). In case of uncertain selection between 2 consecutive classification groups, the patient was classified to the higher one. In the olecranon ossification center, rectangular apophysis demonstrated stage 6 in the lateral view of the radiographs (Figure 2). Statistics The receiver operating characteristic (ROC) curve was calculated to find the cutoff-points for Sauvegrain stages of 4 different ossification centers that identify impaired ossification. We also try to determine optimal cutoff value for calendar age; 9 years of age was indicative but statistical significance was not reached. Using the cutoff points of Sauvegrain classification for impaired union, diagnostic accuracy of the classification was evaluated by calculating sensitivity, specificity, positive predictive value, and negative predictive value with their 95% confidence intervals (CI). Furthermore, a standardized normal distribution (SND) exact test was used to compare the proportions of nonunion in classes defined by the cutoff values of olecranon stage and the indicative cutoff point for age (≥ 9 years) as well as sex. A logistic regression model was used to determine the risk of low-quality healing between the groups. The effect of higher chronological age on low-quality union was also tested by using age as a continuous variable in regression analysis. P < 0.05 was considered significant, requiring that all analyses were 2-sided. Statistics were calculated using StatsDirect Statistical Software (version 3.2.8, https://www. statsdirect.co.uk/, 2013) and the SPSS Statistical Package (version 26.0, IBM Corp, Armonk, NY, USA, 2019). Ethics, funding, and potential conflicts of interest Following official instructions by the Ethical Board of Northern Finland Hospital District, Oulu, Finland, ethical board evaluation was waived and no ethics committee approval was

needed. The approval by the local institution was obtained prior to study initiation. National research funding (VTR) was obtained for the study. Foundation of Pediatric Research have supported the study. This was a researcher-initiated study with no commercial conflict of interest. JJS was in receipt of a grant from the Pediatric Research Society, Alma and K.A. Snellman Foundation, Emil Aaltonen Foundation. NL and JJS are members of the European Paediatric Orthopaedic Society. The other authors declare no conflicts of interest.

Results (Figure 3) Olecranon ossification center The optimal cutoff-point of the olecranon ossification center was 6, according to the ROC curve. All patients who suffered from impaired bone healing were identified with Sauvegrain stage 6–7 (sensitivity (Se) = 100%, CI 40–100%) (Table 1). The development stage ≥ 6 of the olecranon in the primary radiographs was associated with impaired union in 4 out of 20 patients treated with ESIN, compared with none in olecranon stage 1–5 (CI of difference 8–42%). There was no difference in open fractures, higher displacement (> 5 mm vs. ≤ 5 mm) or open reduction (yes vs. no) between the patients with olecranon stage 6–7 vs. 1–5 (Table 2). Trochlea ossification center With the ROC curve we found that the optimal cutoff point of the trochlear ossification center was 4. All fractures with impaired bone healing after ESIN were found to have trochlea stage 4–5 primarily. Sensitivity of this test was thus high (Se = 100%, CI 40–100%) in recognizing impaired union. Specificity (Sp) of test was 70% (CI 57–80%) (Table 1). Trochlear ossification center ≥ 4 was associated with impaired bone healing in 4 out of 24 patients (CI of difference 7–36%), compared with none in trochlear stage 0–3.5. There seemed to be more open fractures, but no difference in displacement (> 5 mm vs. ≤ 5 mm) or open reduction (yes vs. no) between the patients with trochlea stage 4–5 vs. 1–3.5 was found (Table 2).

464

Acta Orthopaedica 2021; 92 (4): 461–467

Table 2. Patients with higher vs. lower development stage of the elbow ossification centers were compared regarding severity of the fracture (higher displacement, open fracture) and open reduction

Olecranon Trochlea Lateral humeral condyle Proximal radial head Stage Stage Stage Stage Stage Stage Stage Stage 1–5 6–7 1–3.5 4–5 1–5 6–9 1–5 5.5–6 n = 50 n = 20 p-value n = 46 n = 24 p-value n = 40 n = 30 p-value n = 48 n = 22 p-value

Impaired union, n 0 4 0.003 0 4 0.006 0 4 0.015 0 4 0.004 Difference, % (95% CI) 20 (8.0–42) 17 (6.6–36) 13 (3.4–30) 18 (7.3–39) Severity of the fracture Open fracture, n 5/50 5/20 0.08 4/45 6/24 0.05 3/39 7/30 0.05 4/47 6/22 0.04 Displaced > 5 mm, n 27/45 12/19 1.0 26/41 13/23 0.5 19/35 20/29 0.2 26/43 13/21 1.0 Open reduction, n 28/49 10/20 0.5 25/45 13/24 1.0 21/39 17/30 1.0 25/47 13/22 0.6 Age in years, mean (SD) 8.9 (2.3) 12 (2.3) < 0.001 8.6 (2.2) 12 (2.1) < 0.001 8.5 (2.2) 12 (2.1) < 0.001 8.7 (2.3) 12 (2.0) < 0.001

Lateral humeral condyle ossification center Lateral humeral condyle ossification center stage 6 was found to be the optimal cutoff point according to ROC curve (Se = 100%, CI 40–100%). Specificity was 61% (CI 48–72%) (Table 1). The development stage ≥ 6 of the lateral humeral condyle was associated with impaired union in 4 out of 30 patients (CI 3–30%) who were treated with ESIN. There seemed to be more open fractures among higher ossification stages. No difference was found in displacement (> 5 mm vs. ≤ 5 mm) or open reduction (yes vs. no) between the patients with lateral humeral stage 6–9 vs. stage 1–5 (Table 2). Proximal radial head ossification centers The optimal cutoff point of Sauvegrain classification was 5.5 for the radial head, based on ROC curve. Sensitivity was 71% (CI 40–100%) and the specificity was 73% (CI 60–83%) (Table 1). The development stage ≥ 5.5 of the proximal radial head was associated with impaired union in 4 out of 22 patients (CI 7.3–39%) who were treated with ESIN. There were more open fractures, but no difference in displacement (> 5 mm vs. ≤ 5 mm) or open reduction (yes vs. no) between the patients with proximal radial head stage 5.5–6 vs. 1–5 was seen (Table 2). Effect of age and sex No statistically significant effect of inferior union was observed according to chronological age (p = 0.3) or between the age groups ≥ 9 years vs. < 9 years (p = 0.3). 1 out of 36 boys and 3 out of 34 girls presented low-quality fracture union (difference –6%, CI –21 to 7%).

Discussion The main finding of this study was that the higher development stage of the olecranon ossification center, in particular, can be used to estimate the probability of impaired union of a forearm shaft fracture, when treated with ESIN. By using the

method by Sauvegrain and Diméglio, we found that olecranon ossification center 6 or higher was associated with impaired ossification in 4 out of 20 patients. This means that a rectangular or fused olecranon ossification center rather than the more immature convex half-moon or lacking olecranon apophysis in the lateral radiographs may be associated with low-quality fracture healing, if ESIN is performed. In practical terms, the olecranon ossification center is easily seen on the conventional lateral view of the forearm radiograph and no extra radiographs are needed. 1 in 5 patients suffering from impaired ossification is a recognized number, given that most childhood fractures heal fast, meaning that our finding is clinically important. The fundamental idea of the study method is based on the growing skeleton. During the growth period, there is cartilage between the metaphysis and epiphysis of the long bones and the calcification of the epiphyses is lacking. The timing of calcification of these secondary ossification centers in the period of growth after birth differs, meaning that calendar age is not accurate method in evaluating the physiological maturation of the skeleton in individual children. In contrary, bone age per se describes the stage of skeletal development in all people, irrespective of calendar age, sex, or ethnic group (Satoh 2015). There are several methods for assessing bone status in children, while the Risser bone maturity classification is still a reference method in many institutions (Thodberg et al. 2010). In scoliosis treatment, the method of Sauvegrain et al. (1962) for the assessment of skeletal age with use of radiographs of the elbow has been used (Charles et al. 2007). In general, complete ossification of the elbow secondary ossification centers correlates with full bone maturity. Modified by Diméglio et al (2005), Sauvegrain’s method is based on systematic and regular morphological development of the elbow apophyses during the accelerating growth phase in puberty (Diméglio et al. 2005, Charles et al. 2007). Assessment of these apophyses allows skeletal age to be evaluated accurately at 6-month intervals. The time of appearance of ossification centers around the elbow seems not to vary between sexes (Cheng et al. 1998). A rectangular or fused olecranon apophysis, instead

Acta Orthopaedica 2021; 92 (4): 461–467

of halfmoon circular, minor, or lacking apophysis refers to stage 6 or higher. In these circumstances, it was reasonable to hypothesize that skeletal maturation of the elbow area would aid in selecting the optimal treatment of forearm shaft fractures between child-type and adult-type procedures. In addition to the olecranon ossification center, we found that the other 3 ossification centers, trochlea, lateral humeral condyle, and proximal radial head, were also associated with impaired union. This is reasonable, while all secondary ossification centers develop in a particular order and thus are related to bone age. However, although associated with impaired union, the other ossification centers had lower statistical variables when compared with the olecranon. From a clinical point of view, these other ossification centers may support the clinician as well, when he/she is considering the maturation stage of the patient’s skeleton. Our finding regarding other ossification centers is different from that of the report by Morrison et al. (2020), which is the only previous study of the issue. They reported on only the olecranon apophysis and its association with inferior results of ESIN in childhood forearm shaft fractures. In their study, olecranon stage > 3 (on the scale 1–7) was associated with increased complications in bone healing. In our study, the optimal cutoff point was higher: no patient presented disturbed healing if the olecranon development status was between 1 and 5. However, there was a difference in determining low-quality fracture healing between the studies: Morrison et al. used 6 months of impaired healing as nonunion, while the respective time was 12 months in our study. The study question of whether to use ESIN or other methods in older children with unstable forearm shaft fractures is clinically essential, bearing in mind that forearm shaft fractures are usual in that age group and their incidence is still increasing (Mäyränpää et al. 2010). There are many technical possibilities in performing operative stabilization, including open reduction and internal plate and screw fixation (ORIF) and intramedullary nailing with flexible nails (elastic stable intramedullary nailing, ESIN) (Bochang et al. 2005, Fernandez et al. 2010). Several implant materials have been used in surgical fixation of forearm shaft fractures, such as stainless steel, titanium alloy, and biodegradable composites (Van der Reis et al. 1998, Colaris et al. 2013, Korhonen et al. 2018). ESIN is currently the preferred method of surgical fixation in children, compared with ORIF, due to several advantages such as better cosmesis, decreased operative time, early return to activities, intact fracture hematoma, and good union rate (Lascombes et al. 1990, Schmittenbecher 2005). However, from the clinical point of view, there is still some controversy in the indications as to whether to use ESIN or plating in older children and adolescents, regardless of encouraging evidence in younger children (Ortega et al. 1996, Baldwin et al. 2014). The displacing muscle forces are greater in adolescents, the length of the shaft is higher resulting in greater torque at the fracture, the remodeling capacity is lower, and bone turnover rate is slower in adolescents, compared with younger children (Ortega et al.

465

1996, Sinikumpu and Serlo 2015). Thus, using the optimal method of treatment is particularly important in forearm midshaft fractures; as opposed to many other pediatric fractures, there is a risk of impaired healing in these bones, especially the ulna. Delayed union is reported to occur in 7% of patients and the risk of nonunion is 1–3 % (Mehlman and Wall 2006, Schmittenbecher et al. 2008, Sinikumpu et al. 2013). The ulna is a subcutaneous bone and more prone to nonunion than the radius (Fernandez et al. 2009). Furthermore, there is limited remodeling in the forearm diaphysis, which is far from the metabolically active growth plates. Malunited fractures tend to be associated with decreased forearm rotation, resulting in a more aggressive approach with surgical stabilization often being required (Kutsikovich et al. 2018). The strength of our study is that all children treated operatively for forearm shaft fractures during the recruitment period were enrolled. The patients were treated according to best available practice. Their treatment was based on the authentic decision of the treating surgeon. Another strength is that the patients with higher maturation stage of the ossification centers were similar regarding fracture severity and the need for open reduction, when compared with the more immature patients. This supports the hypothesis that no confusing fracture- or surgery-related factors affected the results, but impaired union was associated with a higher stage of elbow ossification centers. Follow-up was based on normal practice in the institution and there was no loss of follow-up. Another fact that emphasizes the value of this study is that there is no evidence-based level I or II data supporting either ORIF or ESIN in children’s forearm fractures (Abraham et al. 2011, Baldwin et al. 2014). The chronological calendar age cannot be used as a determinant for any surgical procedure in forearm shaft fractures. In our study, there was no statistically significant classifying value for optimal cutoff age point that would predict low-quality fracture union. Higher calendar age had no statistically significant effect on impaired fracture healing in the patients in this study. A limitation is that the material was collected retrospectively and no causality but only statistical association could be evaluated. In addition, the study model was not validated in any external dataset. There were several patients who were treated by other surgical methods, such as biodegradable implants or Kirschner wires, and thus were excluded. Further, no data of long-term outcomes was available. High sensitivity of the Sauvegrain method of olecranon classification means that stages 1–5 (negative result test) ruled out impaired union but a wide (97.5%) confidence interval and in particular its lower bound (40%) indicate that 100% sensitivity of the test overestimates the result. This means that not all patients with an ossification center higher than the determined optimal cutoff point suffered from impaired healing. The positive predictive value (PPV = 20%, CI 6–44%) means that the number of false-positive test results was high (80%). For these false positives, ESIN would still be the appropriate method of

466

treatment, resulting in good bone healing, which needs to be emphasized. From a clinical point of view, a minority of the patients with higher Sauvegrain stage of the olecranon suffered from impaired union. This highlights that ESIN is in general a superior method in treating pediatric forearm shaft fractures. The overall risk of disturbed healing of forearm shaft fractures treated with ESIN is in general low, and selection of the surgical procedure needs to be decided individually for every patient. However, a 20% of risk of impaired bone healing is still a high rate of impaired recovery in the growing skeleton and the method we describe may aid in improving the treatment of these particular patients: as a straightforward way to assess the bone maturity of each patient, this method could give additional information for a surgeon treating a child with a forearm shaft fracture. As another limitation, the number of patients with impaired bone healing was low, which justifies further studies in greater study settings. Larger studies are important to further analyze the effect of gender. However, in the authors’ understanding, the reported method could be feasible for both sexes, given that the bone maturation process itself is not dependent on gender, and the development of secondary ossification centers around the elbow does not differ between the sexes (Cheng et al. 1998, Satoh. 2015). In conclusion, in this study we found that the rectangular shape of olecranon maturation stage 6 or higher, in particular, seen on the lateral view of conventional forearm radiographs, can be used when considering the different treatment methods for older children and teenagers with forearm shaft fractures.

LK and JK contributed to data collecting. MP contributed in conception and performance of the study as well as drafting of the manuscript. Acta thanks Torsten Backteman and Johannes Mayr for help with peer review of this study.

Abraham A, Kumar S, Chaudhry S, Ibrahim T. Surgical interventions for diaphyseal fractures of the radius and ulna in children. Cochrane Database Syst Rev 2011; (11): CD007907. Alrashedan B S, Jawadi A H, Alsayegh S O, Alshugair I F, Alblaihi M, Jawadi T A, Hassan A A, Alnasser A M, Aldosari N B, Aldakhail M A. Outcome of diaphyseal pediatric forearm fractures following non-surgical treatment in a level I trauma center. Int J Health Sci 2018; 12: 60. Asadollahi S, Pourali M, Heidari K. Predictive factors for re-displacement in diaphyseal forearm fractures in children-role of radiographic indices. Acta Orthop 2017; 88(1): 101-8. Baldwin K, Morrison 3rd M J, Tomlinson L A, Ramirez R, Flynn J M. Both bone forearm fractures in children and adolescents, which fixation strategy is superior—plates or nails? A systematic review and meta-analysis of observational studies. J Orthop Trauma 2014; 28(1): e8-e14. Bhandari M, Guyatt G H, Swiontkowski M F, Tornetta 3rd P, Sprague S, Schemitsch E H. A lack of consensus in the assessment of fracture healing among orthopaedic surgeons. J Orthop Trauma 2002; 16(8): 562-6. Bochang C, Jie Y, Zhigang W, Weigl D, Bar-On E, Katz K. Immobilisation of forearm fractures in children: extended versus flexed elbow. J Bone Joint Surg Br 2005; 87(7): 994-6.

Acta Orthopaedica 2021; 92 (4): 461–467

Charles Y P, Diméglio A, Canavese F, Daures J. Skeletal age assessment from the olecranon for idiopathic scoliosis at Risser grade 0. J Bone Joint Surg Am 2007; 89(12): 2737-44. Cheng J C, Wing-Man K, Shen W Y, Yurianto H, Xia G, Lau J T, Cheung A Y. A new look at the sequential development of elbow-ossification centers in children. J Pediatr Orthop 1998; 18(2): 161-7. Colaris J W, Allema J H, Reijman M, Biter L U, De Vries M R, Van De Ven, C P, Bloem R M, Verhaar J A N. Risk factors for the displacement of fractures of both bones of the forearm in children. Bone Joint J 2013; 95-B(5) (5): 689-93. Diméglio A, Charles Y P, Daures J, De Rosa V, Kaboré B. Accuracy of the Sauvegrain method in determining skeletal age during puberty. J Bone Joint Surg Am 2005; 87(8): 1689-96. Fernandez F, Eberhardt O, Langendörfer M, Wirth T. Nonunion of forearm shaft fractures in children after intramedullary nailing. J Pediatr Orthop B 2009; 18(6): 289-95. Fernandez F, Langendörfer M, Wirth T, Eberhardt O. Failures and complications in intramedullary nailing of children’s forearm fractures. J Child Orthop 2010; 4(2): 159-67. Franklin C C, Robinson J, Noonan K, Flynn J M. Evidence-based medicine: management of pediatric forearm fractures. J Pediatr Orthop 2012; 32: 131. Herman M J, Marshall S T. Forearm fractures in children and adolescents: a practical approach. Hand Clin 2006; 22(1): 55-67. Joeris A, Lutz N, Blumenthal A, Slongo T, Audigé L. The AO pediatric comprehensive classification of long bone fractures (PCCF) part I: location and morphology of 2,292 upper extremity fractures in children and adolescents. Acta Orthop 2017; 88(2): 129-32. Korhonen L, Perhomaa M, Kyrö A, Pokka T, Serlo W, Merikanto J, Sinikumpu J. Intramedullary nailing of forearm shaft fractures by biodegradable compared with titanium nails: results of a prospective randomized trial in children with at least two years of follow-up. Biomaterials 2018; 185: 383-92. Kutsikovich J I, Hopkins C M, Gannon 3rd E W, Beaty J H, Warner W C, Sawyer J R, Spence D D, Kelly D M. Factors that predict instability in pediatric diaphyseal both-bone forearm fractures. J Pediatr Orthop B 2018; 27(4): 304-8. Lascombes P, Prevot J, Ligier J N, Metaizeau J P, Poncelet T. Elastic stable intramedullary nailing in forearm shaft fractures in children: 85 cases. J Pediatr Orthop 1990; 10(2): 167-71. Mäyränpää M K, Mäkitie O, Kallio P E. Decreasing incidence and changing pattern of childhood fractures: a population–based study. J Bone Miner Res 2010; 25(12): 2752-9. Mehlman C T, Wall E J. Injuries to the shafts of the radius and ulna. In: Beaty J H, Kasser JR, editors. Rockwood and Wilkins’ fractures in children. VI ed. Philadelphia: Lippincott Williams & Wilkins 2006. p. 399441. Morrison 3rd M J, Speirs J N, Chicorelli A M, Garner M, Flynn J J M, Herman M J. Intramedullary fixation of both bone forearm fractures in children and adolescents: healing correlates with development of the olecranon apophysis. J Pediatr Orthop 2020; 40(3): e198-e202. Ortega R, Loder R T, Louis D S. Open reduction and internal fixation of forearm fractures in children. J Pediatr Orthop 1996; 16(5): 651-4. Rehman S, Sokunbi G. Intramedullary fixation of forearm fractures. Hand Clin 2010; 26(3): 391-401. Satoh M. Bone age: assessment methods and clinical applications. Clin Pediatr Endocrinol 2015; 24(4): 143-52. Sauvegrain J, Nahum H, Bronstein H. Study of bone maturation of the elbow. Ann Radiol (Paris) 1962; 5: 542-50. Schmittenbecher P P. State-of-the-art treatment of forearm shaft fractures. Injury 2005; 36(1): 25. Schmittenbecher P, Fitze G, Gödeke J, Kraus R, Schneidmüller D. Delayed healing of forearm shaft fractures in children after intramedullary nailing. J Pediatr Orthop 2008; 28(3): 303-6. Sinikumpu J, Serlo W. The shaft fractures of the radius and ulna in children: current concepts. J Pediatr Orthop B 2015; 24(3): 200-6.

Acta Orthopaedica 2021; 92 (4): 461–467

Sinikumpu J, Lautamo A, Pokka T, Serlo W. The increasing incidence of paediatric diaphyseal both-bone forearm fractures and their internal fixation during the last decade. Injury 2012; 43(3): 362-6. Sinikumpu J, Pokka T, Willy S. The changing pattern of pediatric both-bone forearm shaft fractures among 86,000 children from 1997 to 2009. Eur J Pediatr Surg 2013; 23(04): 289-96. Slongo T F, Audigé L, AO Pediatric Classification Group. Fracture and dislocation classification compendium for children: the AO pediatric comprehensive classification of long bone fractures (PCCF). J Orthop Trauma 2007; 21(10 Suppl.): 135.

467

Thodberg H H, van Rijn R R, Tanaka T, Martin D D, Kreiborg S. A paediatric bone index derived by automated radiogrammetry. Osteoporos Int 2010; 21(8): 1391-400. Van der Reis W, Otsuka N Y, Moroz P, Mah J. Intramedullary nailing versus plate fixation for unstable forearm fractures in children. J Pediatr Orthop 1998; 18: 9-13. Wall L B. Staying out of trouble performing intramedullary nailing of forearm fractures. J Pediatr Orthop 2016; 36: 71. Zionts L E, Zalavras C G, Gerhardt M B. Closed treatment of displaced diaphyseal both-bone forearm fractures in older children and adolescents. J Pediatr Orthop 2005; 25(4): 507-12.

468

Acta Orthopaedica 2021; 92 (4): 468–471

Below-elbow cast sufficient for treatment of minimally displaced metaphyseal both-bone fractures of the distal forearm in children: long-term results of a randomized controlled multicenter trial Linde MUSTERS 1, Leon W DIEDERIX 2, Kasper C ROTH 3, Pim P EDOMSKIS 4, Gerald A KRAAN 5, Jan H ALLEMA 6, Max REIJMAN 3, and Joost W COLARIS 3 1 Department of Orthopedics, Noordwest Ziekenhuisgroep Alkmaar, The Netherlands; 2 Department of Orthopedics, Elkerliek Hospital, Helmond; 3 Department of Orthopedics, Erasmus MC, University Medical Centre, Rotterdam; 4 Department of Surgery, Erasmus MC, University Medical Centre, Rotterdam; 5 Department of Orthopedics, Reinier de Graaf, Delft; 6 Department of Surgery, Haga Hospital, The Hague, The Netherlands

Correspondence: l.musters@nwz.nl Submitted 2020-06-29. Accepted 2021-01-20.

Background and purpose — We have previously shown that children with minimally displaced metaphyseal bothbone forearm fractures, who were treated with a belowelbow cast (BEC) instead of an above-elbow cast (AEC), experienced more comfort, less interference in daily activities, and similar functional outcomes at 7 months’ follow-up (FU). This study evaluates outcomes at 7 years’ follow-up. Patients and methods — A secondary analysis was performed of the 7 years’ follow-up data from our RCT. Primary outcome was loss of forearm rotation compared with the contralateral forearm. Secondary outcomes were patient-reported outcome measures (PROMs) consisting of the ABILHAND-kids and the DASH questionnaire, grip strength, radiological assessment, and cosmetic appearance. Results — The mean length of FU was 7.3 years (5.9– 8.7). Of the initial 66 children who were included in the RCT, 51 children were evaluated at long-term FU. Loss of forearm rotation and secondary outcomes were similar in the 2 treatment groups. Interpretation — We suggest that children with minimally displaced metaphyseal both-bone forearm fractures should be treated with a below-elbow cast.

Long-term follow-up of children with forearm fractures is scarce but essential, because the remodeling capacity by growth can behave as a friend or an enemy. Previous studies with short-term follow-up shown that metaphyseal both-bone fractures of the distal forearm could safely be treated with a below-elbow cast (BEC) (Bohm et al. 2006, Webb et al. 2006, Paneru et al. 2010, Hendrickx et al. 2011, Colaris et al. 2012, Van Den Bekerom et al. 2012). Our previous randomized multicenter controlled trial compared BEC with above-elbow cast (AEC) for the treatment of minimally displaced metaphyseal both-bone fractures of the distal forearm in children. This RCT concluded that children with minimally displaced metaphyseal both-bone fractures of the distal forearm should be treated with a below-elbow cast (Colaris et al. 2012). We now report the long-term 7-year follow-up of these 2 treatment groups regarding loss of forearm rotation, patient-reported outcomes measures (ABILHAND-kids questionnaire and DASH questionnaire (Hudak et al. 1996, Penta et al. 1998, Arnould et al. 2004), grip strength, radiological assessment, and cosmetic appearance (Bohm et al. 2006, Paneru et al. 2010, Hendrickx et al. 2011, Colaris et al. 2012, Van Den et al. 2012)

Patients and methods Trial design and participants All patients who had been previously included between 2006 and 2010 in the RCT were invited to our outpatient clinic to determine long-term clinical outcomes with a minimum follow-up of 5 years (Colaris et al. 2012). These patients had © 2021 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group, on behalf of the Nordic Orthopedic Federation. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. DOI 10.1080/17453674.2021.1889106

Acta Orthopaedica 2021; 92 (4): 468–471

Table 1. Baseline characteristics of the population. Values are count unless otherwise specified Baseline

Below Above elbow cast elbow cast

Number of children 25 Age at trauma (SD) 7.5 (1.4 ) Male sex 12 Dominant arm 5 Type of fracture, radius Buckle 0 Greenstick 16 Complete fracture 9 Type of fracture, ulna Buckle 2 Greenstick 19 Complete fracture 4

26 6.2 (1.4) 10 10 2 17 7

469

Table 2. Representation of follow-up population Outcome at 7 months follow-up

Lost to FU (95% CI) n = 15

Included (95% CI) n = 51

Total (95% CI) n = 66

Age at trauma, years Male sex, n Forearm rotation 7 months (°) Loss of rotation (°) ABILHAND-kids questionnaire (points) Complications (%) VAS cosmetics parents/child (0–10) VAS cosmetics surgeon (0–10)

7.9 (6.1–9.8) 8 148 (144–153) 4.9 (2.9–6.9) 41.4 13 9.6 9.8

6.8 (5.8–7.8) 29 139 (131–148) 4.3 (0.6–8.1) 41.7 16 9.4 9.7

7.1 (6.2–7.9) 37 146 (142–150) 4.8 (3.1–6.5) 41.6 15 9.4 9.7

4 19 3

been children with a minimally displaced metaphyseal fracture of the radius and ulna, who had been randomized between treatment with AEC or BEC. Informed consent was again obtained from all participants and from all the parents of children aged < 12 years. Outcomes measures Our primary outcome measure was loss of forearm rotation in comparison with the contralateral side. Secondary outcome measures were patient-reported outcome measures (PROMs): using the Dutch version of the DASH and ABILHAND-kids questionnaire, wrist and elbow range of motion, grip strength (using a JAMAR Dynamometer), VAS scores regarding cosmetic appearance (scars and angulation of the forearm), and radiological assessment of malunion (Hudak et al. 1996, Penta et al. 1998, Arnould et al. 2004). An orthopedic surgeon (LD) measured forearm rotation, flexion, and extension of wrist and elbow using visual estimation and a goniometer with increments of 2°. The follow-up was organized in the patient’s original hospital of inclusion. Both arms were examined to determine functional loss. Grip strength was measured using a Jamar dynamometer (Performance Health International, Sutton-in-Ashfield, UK), conducting one measurement comparing both arms. Patients were asked to fill in 2 PROMs, the DASH and the ABILHAND-kids questionnaire, and a VAS for cosmetic appearance. Cosmetic appearance was assessed by the patient, or by the parents in children < 12 years, and by the investigator (LD). The radiological assessment consisted of a anteroposterior and lateral radiograph of the wrist. One of the authors (PE), measured the angulation of the radius and ulna (Zimmermann et al. 2004, Jeroense et al. 2015). Statistics To evaluate whether the included patients in the current study are representative of the total initial study population of 66 patients, we compared the baseline characteristics, func-

tional outcome, and complications at short-term follow-up (7 months) between the included patients versus those lost to follow-up. Long-term results of primary and secondary outcome measures of the 2 treatment groups (AEC vs. BEC) were compared. Differences were analyzed using 1-way ANOVA to correct for multiple comparisons. Results are presented as mean SD or 95% confidence interval (CI). To assess the interrater reproducibility of radiographic assessment 2 authors (PE and LD) measured angulations of the radius and ulna of 25 cases (at cast removal and at final follow-up). Intra-class correlation coefficient was calculated. Statistical analyses were performed using IBM SPSS Statistics version 23. Ethics, registration, funding, and potential conflict of interest Ethics approval was obtained for this post-trial FU study from regional medical ethics committee (NL41839.098.12). The original RCT was registered in ClinicalTrials.gov NCT 00397995. The current author did not receive any funding. The author of the primary study received a grant from the Anna Foundation, the Netherlands. None of the authors declare any conflicts of interest.

Results Between 2006 and 2010, 66 children were included in the RCT by Colaris et al. (2012) and 51 of these children participated in the current study: 26 out of 31 patients who were allocated to AEC, and 25 out of 35 patients who were allocated to BEC. The mean length of follow-up was 7.3 years (5.9–8.7). Baseline characteristics of the groups and primary outcome, the loss of forearm rotation, showed no statistically significant differences between the 2 treatment groups (Tables 1 and 2) Secondary outcomes were similar between the 2 treatment groups (Table 3). No statistically significant differences were found in sagittal or coronal angulation of the radius and ulna in either group (Table 4). The interrater reproducibility of the radiological assessment showed an intra-class correlation of 0.83 (CI 0.57–0.94).

470

Acta Orthopaedica 2021; 92 (4): 468–471

Table 3. Data (95% confidence intervals) on primary and secondary outcomes at long-term follow-up Factor

Below elbow cast Above elbow cast n = 25 n = 26

Age at follow-up, years 14.9 (13.3–16.5) Follow-up length, years 7.5 (6.9–8.0) Loss of forearm rotation –0.72 (–4.5 to 3.1) Loss of wrist flexion–extension 0.80 (–3.2 to 4.8) ABILHAND-kids questionnaire (points) 41.4 (40.1–42.7) DASH score (points) 4.4 (0–13) Grip strength, kg 31 (17–45) VAS cosmetics parents (0–10) 9.4 (9.0–9.8) VAS cosmetics surgeon (0–10) 9.7 (9.5–9.9)

13.2 (11.8–14.7) 7.1 (6.5–7.7) 0.58 (–5.1 to 6.2) 0.58 (–2.4 to 3.5) 41.9 (41.5–42.3) 2.1 (0–6.9) 28 (14–42) 9.3 (8.9–9.7) 9.6 (9.4–9.9)

Discussion We present the results of a multicenter randomized controlled study with 7-year follow-up, concerning children with a minimally displaced metaphyseal both-bone fracture of the distal forearm who had been treated with either AEC or BEC. Shortterm follow-up of these patients at 7 months showed no statistically significant differences, except more cast comfort and less interference with daily activities in the group treated with BEC. The 7-year follow-up revealed similar outcome between the 2 groups concerning loss of forearm rotation, patientreported outcome measures (PROMs): DASH and ABILHAND-kids questionnaire, grip strength (using a JAMAR dynamometer), VAS scores regarding cosmetic appearance (scars and angulation of the forearm), and radiological assessment of malunion. Previous research with short-term follow-up A meta-analysis by Hendrickx et al. (2011) included 3 RCTs comparing AEC with BEC for the treatment of both-bone distal forearm fractures in 219 children. Secondary fracture displacement was seen in 15% in the BEC group and in 28% in the AEC group. An update of this meta-analysis, which included 2 more studies with 174 more children, found no treatment preference any longer. Concerning the plasterrelated complication rate, data was pooled and showed no difference between the 2 treatment strategies (Van Den Bekerom et al. 2012). Previous research with long-term follow-up Literature on long-term follow-up of nonoperative treatment of forearm fractures in children is scarce. A retrospective study of Zimmerman et al. (2004) included 220 children with 232 distal forearm fractures between 1980 and 1992. The mean age of included children was 9 years (1–16) and the mean time of follow-up 10 years (5–16). In 40 children both the radius and ulna were fractured. The purpose of this study was to investigate the frequency and extent of clinical and radio-

Table 4. Radiological angulation (°) (95% confidence intervals) View Anteroposterior Ulna Radius Lateral Ulna Radius

Below elbow cast Cast removal Final

Above elbow cast Cast removal Final

7 (2–12) 5 (2–8) 4 (0–8) 6 (3–9)

6 (2–10) 5 (2–8) 4 (0–8) 5 (1–9)

6 (2–10) 4 (1–7) 10 (5–15) 5 (2–8)

6 (3–9) 4 (2–6) 10 (5–15) 4 (1–7)

Cast removal at 6 weeks after casting

logical late sequelae and identify predicting factors. The overall outcome was very good in 72%, and children < 10 years of age showed more favorable results, even with a malunion. Children > 10 years of age with an angulatory deformity > 20 degrees and/or more than 50% displacement at consolidation showed more pain and less function. Further factors having a negative influence on the outcome were repeated reduction and an additional fracture of the ulna. We would like to address the ongoing debate on how much fracture angulation can be accepted at what age. The highest remodeling capacity is expected in young children with fractures close to the most active distal growth plate, and angulation in the sagittal plane. However, the literature on acceptable angulation in pediatric forearm fractures is scarce. Ploegmakers and Verheyen (2006) carried out a meta-analysis and together with the opinions of 18 international experts an effort was made to provide insight into the limits of acceptance of angular deformation in the nonoperative treatment of pediatric forearm fractures. More specifically for metaphyseal both-bone fractures of the distal forearm, the literature showed acceptable angulation of 11–18°, compared with 6–24° by the experts. Our primary inclusion criteria (fracture angulation < 15° in children < 10 years and < 10° in children ≥ 10 years) were in the range of these results. Our good clinical and radiological long-term follow-up results combined with previous literature (Ploegmakers and Verheyen 2006, Prommersberger and Lanz 2000) shows that metaphyseal both-bone fractures of the distal forearm especially in children < 12 years of age remodel satisfactorily (Prommersberger and Lanz 2000). Study limitations Our main limitation is the long-term follow-up percentage of only 77% of the primarily included children, but previous literature on acceptable loss to follow-up suggests that up to 40% loss to follow-up results in minimal attrition of the results (Kristman et al. 2004, Fewtrell et al. 2008). To address the potential effects of loss to follow-up we did a patient group analysis, which showed that the follow-up group was representative of the whole original study group. Limitations of the original study still apply. The reduction criteria were adjusted to only 2 age groups without gender

Acta Orthopaedica 2021; 92 (4): 468–471

distinction, and the reduction criteria were used for both metaphyseal and diaphyseal forearm fractures. Therefore, the reduction criteria for metaphyseal fractures in the youngest children, especially boys, could probably have been too strict. Conclusions At long-term follow-up we found similar loss of forearm rotation after treatment of minimally displaced metaphyseal both-bone fractures of the distal forearm in children treated with AEC or BEC. Furthermore patient-reported outcome measures and radiological assessment were similar. Based on short- and long-term results, we suggest that children with minimally displaced metaphyseal both-bone forearm fractures should be treated with a below-elbow cast. LM writing, data analysis, submitting; LD writing, data analysis; KR data analysis; GR reviewer, writing; JA reviewer, writing; JC reviewer, writing. Acta thanks Martin Gottliebsen and Klaus Dieter Parsch for help with peer review of this study.

Arnould C, Penta M, Renders A, Thonnard J L. ABILHAND-Kids: A measure of manual ability in children with cerebral palsy. Neurology 2004; 63 (6): 1045-52. doi: 10.1212/01.wnl.0000138423.77640.37. Bohm E R, Bubbar V, Yong Hing K, Dzus A. Above and below-the-elbow plaster casts for distal forearm fractures in children: a randomised controlled trial. J Bone Joint Surg Am 2006; 88 (1): 1-8. doi:10.2106/JBJS.E.00320. Colaris J W, Biter L U, Allema J H, Bloem R M, Van De Ven C P, De Vries M R, Kerver A J H, Reijman M, Verhaar J A N. Below-elbow cast for metaphyseal both-bone fractures of the distal forearm in children: a randomised multicentre study. Injury 2012; 43: 1107-11. doi:10.1016/j. injury.2012.02.020. Fewtrell M S, Kennedy K, Singhal A, Martin R M, Ness A, HaddersAlgra M, Koletzko B, Lucas A. How much loss to follow-up is acceptable

471

in long-term randomised trials and prospective studies? Arch Dis Child 2008; 93 (6): 458-61. doi:10.1136/adc.2007.127316. Hendrickx R P M, Campo M M, Van Lieshout A P W, Struijs P A A, Van Den Bekerom P J. Above- or below-elbow casts for distal third forearm fractures in children? A meta-analysis of the literature. Arch Orthop Traum Surg 2011; 131 (12): 1663-71. doi:10.1007/s00402-011-1363-9. Hudak P L, Amadio P C, Bombardier C. Development of an upper extremity outcome measure: the DASH (Dis- Abilities of the Arm, Shoulder and Hand). Am J Ind Med 1996; 29: 602-8. Jeroense K T V, America T, Witbreuk M M E H, Van Der Sluijs J A. Malunion of distal radius fractures in children: remodeling speed in 33 children with angular malunions of ≥ 15 degrees. Acta Orthop 2015; 86: 233-7. doi :10.3109/17453674.2014.981781. Kristman V, Manno M, Côté P. Loss to follow-up in cohort studies: how much is too much? Eur J Epidemiol 2004; 19 (8): 751-60. doi: 10.1023/b:ejep.0000036568.02655.f8. Paneru S R, Rijal R, Shrestha B P, Nepal P, Khanal G P, Karn N K, Singh M P, Rai P. Randomized controlled trial comparing above- and belowelbow plaster casts for distal forearm fractures in children. J Child Orthop 2010; 4 (3): 233-7. doi:10.1007/s11832-010-0250-1. Penta M, Thonnard J L, Tesio L. ABILHAND: a Rasch-built measure of manual ability. Arch Phys Med Rehabil 1998; 79 (9): 1038-42. doi:10.1016/ S0003-9993(98)90167-8. Ploegmakers J J W, Verheyen C C P M. Acceptance of angulation in the non-operative treatment of paediatric forearm fractures. J Pediatr Orthop B 2006; 15: 428-32. doi:10.1097/01.bpb.0000210594.81393.fe. Prommersberger K J, Lanz U. Fehlverheilte Frakturen des Unterarmes im Wachstumsalter unter besonderer Berucksichtigung der Unterarmlangsachse. Fallbeispiele. Handchir Mikrochir Plast Chir 2000; 32 (4): 250-9. doi:10.1055/s-2000-10934. Van Den Bekerom M P J, Hendrickx R H, Struijs P A. Above- or belowelbow casts for distal third forearm fractures in children? An updated metaanalysis of the literature. Arch Orthop Trauma Surg 2012; 132 (12): 181920. doi: 10.1007/s00402-012-1603-7. Webb G R, Galpin R D, Armstrong D G. Comparison of short and long arm plaster casts for displaced fractures in the distal third of the forearm in children. J Bone Joint Surg Am 2006; 88 (1): 9-17. doi:10.2106/JBJS.E.00131. Zimmermann R, Gschwentner M, Kral F, Arora R, Gabl M, Pechlaner S. Long-term results following pediatric distal forearm fractures. Arch Orthop Trauma Surg 2004; 124: 179-86. doi:10.1007/s00402-003-0619-4.

472

Acta Orthopaedica 2021; 92 (4): 472–478

of Orthopedic Sugery, Örebro University Hospital; 2 Department of Orthopedics, Örebro University; 3 Department of Orthopaedics, Skaraborg Hospital, Skövde, Sweden Correspondence: evelina.pantzar@regionorebrolan.se Submitted 2020-10-23. Accepted 2021-03-21. See Erratum regarding affiliations, page 500 (DOI: 10.1080/17453674.2021.1934801)

Background and purpose — The impact of knee flexion contracture (KFC) on function in cerebral palsy (CP) is not clear. We studied KFC, functional mobility, and their association in children with CP. Subjects and methods — From the Swedish national CP register, 2,838 children were defined into 3 groups: no (≤ 4°), mild (5–14°), and severe (≥ 15°) KFC on physical examination. The Functional Mobility Scale (FMS) levels were categorized: using wheelchair (level 1), using assistive devices (level 2–4), walking independently (level 5–6). Standing and transfer ability and Gross Motor Function Classification (GMFCS) were assessed. Results — Of the 2,838 children, 73% had no, 14% mild, and 13% severe KFC. KFC increased from 7% at GMFCS level I to 71% at level V. FMS assessment (n = 2,838) revealed around 2/3 were walking independently and 1/3 used a wheelchair. With mild KFC (no KFC as reference), the odds ratio for FMS level 1 versus FMS level 5–6 at distances of 5, 50, and 500 meters, was 9, 9, and 8 respectively. Correspondingly, with severe KFC, the odds ratio was 170, 260, and 217. In no, mild, and severe KFC 14%, 47%, and 77% could stand with support and 11%, 25%, and 33% could transfer with support. Interpretation — Knee flexion contracture is common in children with CP and the severity of KFC impacts function. The proportion of children with KFC rose with increased GMFCS level, reduced functional mobility, and decreased standing and transfer ability. Therefore, early identification and adequate treatment of progressive KFC is important.

Knee flexion contracture is a common problem in children with cerebral palsy (CP) (Miller 2005, Cloodt et al. 2018). Due to muscle imbalance, short and spastic hamstring muscles, and prolonged sitting posture, knee flexion contracture may develop and often progresses in adolescence (Miller 2005, Rodda et al. 2006). Although the exact impact of knee flexion contracture and its contribution to the development of flexed knee gait is still not fully understood, it is associated with progressive deterioration of gait in the ambulating child (Bell et al. 2002, Rodda et al. 2006) and it results in difficulties maintaining functional standing, sitting, and transfer in non-ambulatory children (Miller 2005, Cloodt et al. 2018). In addition, knee flexion contracture generates increased forces on the knee joint, which may cause pain (Rodda et al. 2006, Steele et al. 2012, Schmidt et al. 2020). Prevention of knee flexion contracture has not been thoroughly studied, and physiotherapy treatment and focal spasticity reduction have been attempted without convincing effect (Hägglund et al. 2005, Galey et al. 2017). In ambulatory children, there are several reports of improvement of gait pattern and knee flexion contracture after orthopedic surgery (Ma et al. 2006, Rodda et al. 2006, Stout et al. 2008, Taylor et al. 2016). These studies are limited mainly to children in Gross Motor Function Classification System (GMFCS) level I–III, and occasionally level IV, and varies across age groups as well as according to the surgery performed (Ma et al. 2006, Rodda et al. 2006, Stout et al. 2008, Taylor et al. 2016). The Functional Mobility Scale (FMS), the Pediatric Outcomes Data Collection Instrument (PODCI), and the Gross Motor Function Measure dimension D (GMFM D) are often used to assess function after orthopedic surgery; all three instruments describe how the child actually moves in daily life, and not necessarily what his or her capacity is (Russell 1993, Daltroy et al. 1998, Graham et al. 2004).

Acta Orthopaedica 2021; 92 (4): 472–478

473

Subjects retrieved from the Swedish national Cerebral Palsy register n = 3,381 Excluded (n = 543): – incomplete FMS assessment, 398 – no information on passive knee extension, 91 – no information on dominating neurological symptom, 39 – no information on sex, 15

tracture, FMS, standing, and transfer ability ratings were collected for all children with data for each measure.

Knee flexion contracture In the physical examination, passive knee extension was measured in a standardized way with a goniometer Subjects remaining for FMS analysis on the lateral side of the thigh and n = 2,838 shank in supine position, with the hip in full extension and the ankle Excluded Excluded No information on transfer ability No information on standing ability in plantar flexion. For this study, n = 45 n = 23 we defined knee flexion contracture of 5–14° as mild and 15° or more Remaining for transfer analysis Remaining for standing analysis n = 2,793 n = 2,815 as severe, based on the levels used in the CPUP and by Cloodt et al. Figure 1. Flow chart of inclusion. (2018). We classified knee flexion contracture of 4° or less as “no.” We Knee flexion contracture is easy to assess by physical exam- used the data from the leg with the least knee extension (most ination; however, there are limited reports on the prevalence of knee flexion contracture) for analysis. knee flexion contracture and distribution of functional mobility in larger cohorts of children with CP at all GMFCS levels Functional Mobility Scale (FMS) (Rodby-Bousquet and Hägglund 2010, Cloodt et al. 2018). We The FMS is a validated evaluation measure of functional studied knee flexion contracture, functional mobility, and their mobility in children with CP between 4 and 18 years of age association in children with CP. We assumed that the pres- (Graham et al. 2004). The purpose of FMS is to classify the ence and severity of knee flexion contracture contributes to individual’s present and predominant functional mobility at 3 specific distances, 5, 50, and 500 meters, corresponding to at decreased physical function in children with CP. home, in school, and out in the wider community, respectively. The FMS classification is reached through questions posed to the child and parents. It consists of a 6-level ordinal scale, Subjects and methods indicating the child’s level of mobility independence: FMS level 1 = uses wheelchair; 2 = uses a walker; 3 = uses This cross-sectional study was based on data from the Swedish national Cerebral Palsy register (www.cpup.se). More crutches; 4 = uses 1 or 2 canes; 5 = walks independently on than 95% of all children with CP in Sweden born from 2000 even surfaces; 6 = walks independently on all surfaces. For the purpose of this study, we grouped the FMS levels onwards are included in the CPUP register and participate in the follow-up program, which includes a physical examina- according to use of mobility aids and assistive devices: FMS tion and assessment of function by their local physiotherapist level 1, FMS level 2–4, and FMS level 5–6. The same groupyearly or every other year (Alriksson-Schmidt et al. 2017). All ing was used for all 3 distances, 5, 50, and 500 meters. children in Sweden with CP are enlisted in local “habilitation” centers, where specially trained physiotherapists perform the Standing and transfer ability physical examination including passive range of motion and Standing and transfer ability assessments are included in the use the same instructions to assess physical functional includ- CPUP follow-up program and mainly derive from the Intering the Functional Mobility Scale (FMS) and Standing and national Classification of Functioning, Disability and Health Transfer ability (Assessment form physiotherapist, www. (WHO). The physiotherapist asks how the child usually performs standing (categorized as without support, with support, cpup.se). We identified children born between 1999 and 2016 and or cannot stand) and transfer from sitting to standing and from retrieved data from their most recent assessments, 2017 and standing to sitting (categorized as without support, with sup2018. The inclusion criteria were children with CP and age port, or cannot transfer). range 4–18 years. The exclusion criteria were missing data in the register for sex, dominating neurological symptom, pas- Statistics sive range of motion of the knee joint, and mobility. After Multinomial regression was used to analyze the categorical exclusions, 2,838 remained for FMS analysis, 2,793 for trans- outcome FMS distances (5, 50, and 500 meters), with level fer analysis, and 2,815 for standing analysis (Figure 1). The 5–6 (walking independently) as the outcome reference. The degree, classification, and distribution of knee flexion con- independent variable was knee flexion contracture categorized

474

Acta Orthopaedica 2021; 92 (4): 472–478

Table 1. Distribution of age, sex and Gross Motor Function Classification System (GMFCS) level (n = 2,838) Age n

GMFCS level Male Female I II III IV V (58%) (42%) (50%) (17%) (8%) (11%) (14%)

4–6 580 332 248 323 86 35 58 78 7–9 601 360 241 283 120 46 65 87 10–12 696 414 282 367 121 56 69 83 13–15 583 338 245 266 96 51 84 86 16–18 378 200 178 169 64 42 48 55 Total 2,838 1,644 1,194 1,408 487 230 324 389

as: no (≤ 4°, mild (5–14°), severe (≥ 15°), and the analysis was adjusted for age and sex. The categorical outcome “standing” with category “standing without support” as reference and the outcome transfer “ability” with category “transferring without support” as reference was analyzed in the same way. The regression model is based on the assumption that the presence and severity of KFC contributes to decreased physical function. A multinomial regression gives odds ratio (OR) with 95% confidence interval (CI) as association measures. A p-value of less than 0.05 was regarded as statistically significant. All statistical analyses were performed with SPSS, version 22 (IBM Corp, Armonk, NY, USA). Ethics, data sharing, funding, and potential conflicts of interest The Research Ethics Committee at Lund University, Sweden, approved the study, and permission to extract data was obtained from the registry holder. The datasets used and analyzed during the current study are available from the corresponding author on reasonable request. This research did not receive any specific grant from funding agencies. The authors declare that they have no conflict of interest.

Results Demographics This study included 2,838 children between 4 and 18 years, mean 10 years 7 months (SD 3.9). The distribution of age, sex, and GMFCS levels is presented in Table 1. The dominating symptom was spastic CP, reported in 79%; dyskinesia in 11%; and ataxia in 5%. The remaining 5% had a mixed symptom pattern. Knee flexion contracture No knee flexion contracture (≤ 4°) was noted in 73% of the 2,838 children, mean 2° (SD 4°, range –35° to 4°). Mild knee flexion contracture was present in 14%, mean 7° (SD 3°, range 5°–13°), and 13% had severe knee flexion contracture, mean 27° (SD 12°, range 15°–95°).

Table 2. Distribution of knee flexion contracture (no, mild, severe) in relation to GMFCS levels. Values are count (%) Knee flexion contracture GMFCS No Mild Severe level ≤ 4° 5–14° ≥ 15° I 1,310 (93) 95 (6.7) 3 (0.3) II 402 (83) 74 (15) 11 (2) III 123 (54) 62 (27) 45 (20) IV 126 (39) 83 (26) 115 (35) V 113 (29) 89 (23) 187 (48)

Table 3. Distribution of knee flexion contracture (no, mild, severe) in relation to the Functional Mobility Scale (FMS) level 1 (using wheelchair), levels 2–4 (using assistive devices), and levels 5–6 (walking independently), and standing and transfer ability (cannot, with support, and without support). Values are count (%) Knee flexion contracture No Mild Severe Factor ≤ 4° 5–14° ≥ 15° FMS 5 meters FMS level 1 252 (12) 183 (46) 312 (86) FMS level 2–4 93 (5) 38 (9) 29 (8) FMS level 5–6 1,729 (83) 182 (45) 20 (6) FMS 50 meters FMS level 1 282 (14) 198 (49) 333 (92) FMS level 2–4 128 (6) 43 (11) 17 (5) FMS level 5–6 1,664 (80) 162 (40) 11 (3) FMS 500 meters FMS level 1 440 (21) 244 (61) 345 (96) FMS level 2–4 69 (3) 21 (5) 8 (2) FMS level 5–6 1,565 (76) 138 (34) 8 (2) Standing ability Cannot 18 (1) 18 (5) 60 (17) With support 282 (14) 191 (47) 274 (77) Without support 1,757 (85) 192 (48) 23 (6) Transfer ability Cannot 144 (7) 117 (29) 220 (62) With support 214 (11) 97 (25) 114 (33) Without support 1,681 (82) 183 (46) 15 (5)

In the age group 4–6 years, 13% had mild or severe knee flexion contracture (≥ 5°). The proportion rose to 20% in the age group 7–9 years, 27% in the age group 10–12, and 40% in each of the older age groups (13–15 and 16–18 years). Similarly, the proportion of knee flexion contracture rose with increased GMFCS level (Table 2). Functional Mobility Scale Lower FMS levels were noted with increased knee flexion contracture (Table 3, Figure 2). Standing and transfer ability The proportion of children with decreased standing and transfer ability rose with increased knee flexion contracture (Table 3, Figure 3).

Acta Orthopaedica 2021; 92 (4): 472–478

475

Standing, transfer mobility

Distribution of KFC (%) 100

Cannot stand

Stands with support

Stands without support Cannot transfer

Transfers with support

Transfers without support 0 FMS level 1

2–4 5–6

FMS 5 m No (≤4°) KFC

2–4 5–6

FMS 50 m

Mild (5–14°) KFC

1 2–4 5–6 Severe (≥15°) KFC

Figure 2. Distribution of Functional Mobility Scale (FMS) in no, mild, and severe knee flexion contracture (KFC). FMS level 1 (using wheelchair), 2–4 (using assistive devices), and 5–6 (walking independently), (n = 2,838).

Association between knee flexion contracture and FMS The odds ratios (OR) with 95% confidence interval (CI) are adjusted for age and sex. In children with mild knee flexion contracture, for both FMS 5 and 50 meters, we found an OR of 9 (CI 7–11) between FMS level 1 and level 5–6, while the OR was 4 (3–6) between FMS level 2–4 and level 5–6. For FMS 500 meters, the corresponding ORs were 8 (6–10) and 4 (2–6), respectively (Table 4). In children with severe knee flexion contracture, for FMS 5 meters, the OR was 170 (103–279) between FMS level 1 and level 5–6, and 32 (17–61) between FMS level 2–4 and level 5–6. For FMS 50 meters, the corresponding ORs were 266 (141–500) and 22 (10–49), respectively. For FMS 500 meters, the corresponding ORs were 217 (161–445) and 24 (9–66) (Table 4). Association between knee flexion contracture and standing ability In children with mild knee flexion contracture, we found an OR of 12 (6–23) between being unable to stand and standing without support, and an OR of 8 (6–10) between standing with support and standing without support. In children with severe knee flexion contracture, the corresponding ORs were 388 (186–806) and 113 (71–180), respectively. Association between knee flexion contracture and transfer ability Concerning the ability to transfer from sitting to standing and from standing to sitting, in children with mild knee flexion contracture we found an OR of 10 (7–13) between being unable to transfer and transferring without support, and an OR of 5 (4–6) between transferring with support and transferring without support. In children with severe knee flexion contracture, the corresponding ORs were 315 (176–565) and 77 (43–137), respectively.

100

Distribution of KFC (%)

FMS 500 m No (≤4°) KFC

Mild (5–14°) KFC

Severe (≥15°) KFC

Figure 3. Distribution of standing (n = 2,815) and transfer Ω (n = 2,793) ability in no, mild, and severe knee flexion contracture (KFC). Based on the International Classification of Functioning, Disability and Health (ICF). Standing refers to the child’s ability to maintain a basic body position when standing. Transfer mobility refers to the child’s ability to transfer from sitting to standing and from standing to sitting. Table 4. Multinomial regression with outcome Functional Mobility Scale level for distances of 5, 50, and 500 meters and independent variable knee flexion contracture (no, mild, and severe) Distance FMS level FMS level Knee flexion 1 versus 5–6 2–4 versus 5–6 contracture OR (95% CI) p-value OR (95% CI) p-value No (≤ 4°) Reference FMS 5 m Mild (5–14°) 9 (7–11) Severe (≥ 15°) 170 (103–279) FMS 50 m Mild (5–14°) 9 (7–11) Severe (≥ 15°) 266 (141–500) FMS 500 meters Mild (5–14°) 8 (6–10) Severe (≥ 15°) 217 (161–445)

Reference < 0.001 < 0.001

4 (3–6) < 0.001 32 (17–61) < 0.001

< 0.001 < 0.001

4 (3–5) < 0.001 22 (10–49) < 0.001

< 0.001 < 0.001

4 (2–6) < 0.001 24 (9–66) < 0.001

The analysis was adjusted for age and gender. FMS = Functional Mobility Scale, OR = odds ratio, CI = confidence interval.

Discussion In our study with 2,838 children at all GMFCS levels, 73% had no, 14% mild, and 13% severe knee flexion contracture. Functional mobility assessment revealed, when analyzing the total cohort, that around 2/3 were walking independently and 1/3 used a wheelchair. Most children could stand and transfer from sitting to standing without support. We found strong associations between both mild and severe knee flexion contracture and functional mobility, standing, and transfer ability. Knee flexion contracture There is no widely adopted definition of mild, moderate, and severe knee flexion contracture in CP. Miller (2005) defined

476

moderate knee flexion contracture as from 10–30° and severe knee flexion contracture as over 30°. Ma et al. (2006) reported 3 different treatment approaches depending on the severity of knee flexion contracture: 10–15° 15–25°, and over 25°. Cloodt et al. (2018) reported 22% knee flexion contracture of 5°or more, including all GMFCS levels, which is similar to the 27% found in our study; however, their study did not include any functional measures besides GMFCS. In line with previous studies, we noted that knee flexion contracture increases with age and GMFCS level (Hägglund and Wagner 2008, Nordmark et al. 2009, Cloodt 2018). Functional Mobility Scale Harvey et al. (2010) reported substantial agreement between FMS ratings and direct observation, demonstrating the validity of FMS as a measure of performance for children with CP. In post-surgery rehabilitation, the FMS has proven sensitive enough to detect both initial deterioration and ultimate improvement in mobility (Graham et al. 2004, Harvey et al. 2007). Around 2/3 of all children were high functioning, walking independently (FMS level 5–6) at the 3 distances 5, 50, and 500 meters. Around 1/3 used a wheelchair (FMS level 1) for the 3 distances. Only a small percentage of the children move with assistive devices (FMS level 2–4), which is similar to the results by Rodby-Bousquet and Hägglund (2012) in a study of 562 children. Even though many children with CP have the capacity to walk with assistive devices, it may not be how they actually move in daily life (Graham et al. 2004, Palisano et al. 2009, Wilson et al. 2014). Wilson et al. (2014) found that, even though walking capacity in a 1- and 5-minute walking test was associated with FMS level in children at GMFCS levels I to III, much of the variance found remained unexplained. Palisano et al. (2009) reported that young people with CP, although capable of walking in the community environment, chose mobility solutions that were faster and more efficient. The social aspect of being able to keep up with their peers could also be one reason for the finding that few children were in FMS level 2–4 in our study. It has also been reported that pain in children with CP was primarily explained by bone and joint deformities, including knee flexion contracture, and additionally associated with reduced mobility, as evaluated with the FMS (Schmidt et al. 2020). Another explanation could be economic factors: in Sweden, the assistive technology centers, at no cost to the family, provide assistive devices and aids such as wheelchairs. Our results are difficult to compare with the original study on FMS classification by Graham et al. (2004), since their study group consisted of 310 children from a tertiary referral center, in contrast to our populationbased group of 2,838 children. Standing and transfer ability Rodby-Bousquet and Hägglund (2010), in a study of 562 children from the CPUP register, reported similar results to ours on standing and transfer ability. They concluded that the

Acta Orthopaedica 2021; 92 (4): 472–478

GMFCS scale is useful for prediction of the individual child’s future sitting and standing performance. However, their study did not include any data on knee flexion contracture or other deformities in the lower limb. Knee flexion contracture and physical function Ma et al. (2006) noted improvements in knee flexion contracture, gait kinematics, and FMS level for 5 and 50 meters after hamstring lengthening and transfer; furthermore, they noted an association between knee flexion contracture and FMS, which is in line with our results. Taylor et al. (2016) showed that children with CP (mainly those at GMFCS levels I–III) and crouch gait who underwent single-event multilevel surgery (SEMLS), including posterior knee capsulotomy or distal femur extension osteotomy, improved in passive knee extension on physical examination and knee kinematics in gait. However, they observed no improvement in the functional measure GMFM D, despite long-term follow-up. This suggests that the FMS could be more sensitive for detecting changes than GMFM D. Furthermore Ammann-Reiffer et al. (2019) reported that a change of 1 FMS level is a clinically meaningful change in the rehabilitation of gait performance in children with motor disorders, though not exclusively CP. The Gillette Functional Assessment Questionnaire has proven sensitive and detected improved function after surgery (Novacheck et al. 2000), but we did not use it in our study. Harvey et al. (2007) pointed out that FMS level decreases initially after SEMLS and then improves 24 months postoperatively. Interestingly, Harvey et al. showed that the FMS was able to detect changes over time postoperatively for children at GMFCS level III, whereas the GMFCS remained stable. This is in line with the observation by Palisano et al. (1997) that GMFCS is stable over time and does not respond to change after interventions. Knee flexion contracture is often present in children with crouch gait, with increased knee flexion throughout stance phase, and not as common in jump gait pattern where increased knee flexion is mainly noted in early stance (Sutherland and Davids 1993, Rodda et al. 2004). Knee extensor mechanism insufficiency may add dynamic knee flexion in stance to the static knee flexion contracture found on physical examination. In the surgical treatment of flexed knee gait the correction of rotational malalignment with increased internal rotation of the femur and external tibial torsion, and foot deformity, should be included. The common deformity pes planovalgus with midfoot break results in an unstable foot with a short lever arm and deviation of foot progression. A stable plantigrade foot with forward progression is important for both gait and standing, providing a long lever arm for the ground reaction force to act on, to develop a sufficient plantar flexion knee extension couple moment, extending the knee (Miller 2005, Young et al. 2010). Young et al. (2010) stated that in children with CP, over time, knee flexion moments develop and increase the forces

Acta Orthopaedica 2021; 92 (4): 472–478

driving towards flexed knee gait and knee flexion contracture. In our study, we assumed that the severity of knee flexion contracture impacts physical function. In crouch gait, besides GMFCS level, severe knee flexion contracture plays a major role and surgical treatment with distal femoral extension osteotomies and patellar tendon advancement is often warranted (Stout et al. 2008). In contrast to mild knee flexion contracture in GMFCS level I and II, the role of knee flexion contracture is not as obvious (Ma et al. 2006, Young et al. 2010). The flexed knee gait seen in younger children is ideally treated before knee flexion contracture develops, and the timing and dose of treatment is an important factor (Miller 2005, Young et al. 2010). Our results, showing a clear association between both mild and severe knee flexion contracture and FMS, standing, and transfer ability, support the usefulness of these functional assessments for all GMFCS levels. It is not only the presence of KFC that affects physical function but also the severity. KFC is more frequently noted with higher GMFCS levels, which is also true regarding our main outcome variables, lower FMS levels, decreased standing, and transfer ability. Furthermore, the FMS has also been used to study how the care provided during childhood impacts CP in adulthood. Lennon et al. (2018) evaluated patient-reported functional mobility and life satisfaction in a cross-sectional study of young adults with CP. Interestingly, the FMS demonstrated stable functional mobility from childhood to adulthood. They also found that selfrecall of childhood functional mobility using the FMS correlated highly in individuals who had received treatment for flexed knee gait as a child. In our large cohort of children with CP, we found that even children with mild knee flexion contracture (5–14°) used a wheelchair for 46–61% of their daily mobility. To our knowledge, the association between knee flexion contracture and functional mobility has not been explored before. It is important to note that the FMS classification and the assessment of standing and transfer ability are scored on how the individual actually moves in daily life, not necessarily what he or she can achieve at peak performance. This could be of interest to follow into adulthood, as it may reflect the continuous support and help these individuals need to receive to maintain active functional mobility in the community. Limitations Several examiners (physiotherapists) with varying experience performed the measurement of knee flexion contracture. Measurement errors for goniometric assessments in children with CP have previously been reported, and no reliability study was performed in our study, which is a limitation (Stuberg et al. 1988). Nonetheless, the CPUP follow-up program strongly emphasizes practicing and learning how to measure passive range of motion with a goniometer in a standardized way (Nordmark et al. 2009). Furthermore, the variability of the measurements would most likely only influence the results

477

marginally in this large cohort, which was also discussed in a study by Nordmark et al. (2009). Another limitation could be that the FMS classification is determined by the child and parents and therefore might be both underestimated and overestimated; however, the FMS classification system has proven reliable (Graham et al. 2004). Conclusion Knee flexion contracture is common in children with CP and both mild and severe knee flexion contracture impacts functional mobility at all GMFCS levels. It is therefore important to prevent, detect, and treat knee flexion contracture. EP-C: The main investigator, responsible for the formulation of the research goals, design of the study, planning and coordinating the research activity. Performed the majority of the statistical analysis. Writing the initial draft and also reviewing and editing the article. Final approval of the version to be submitted. PW: Contributed to the conceptualization of the study and to reviewing and editing the article. Final approval of the version to be submitted. JR: Oversight and mentorship of the research activity. Formulation of the research goals and design of the study. Planning and coordinating the research activity. Contributed to the statistical analysis. Writing the initial draft and also reviewing and editing the article. Final approval of the version to be submitted. The authors thank the children and their families who participated in the CPUP. Acta thanks Michael Sussman for help with peer review of this study.

Alriksson-Schmidt A I, Arner M, Westbom L, Krumlinde-Sundholm L, Nordmark E, Rodby-Bousquet E, Hägglund G. A combined surveillance program and quality register improves management of childhood disability. Disabil Rehabil 2017; 39(8): 830-6. doi: 10.3109/09638288.2016.1161843. Ammann-Reiffer C, Bastiaenen C H G, Van Hedel H J A. Measuring change in gait performance of children with motor disorders: assessing the Functional Mobility Scale and the Gillette Functional Assessment Questionnaire walking scale. Dev Med Child Neurol 2019; 61(6): 717-24. doi: 10.1111/dmcn.14071. Bell K J, Ounpuu S, DeLuca P A, Romness M J. Natural progression of gait in children with cerebral palsy. J Pediatr Orthop 2002; 22(5): 677-82. Cloodt E, Rosenblad A, Rodby-Bousquet E. Demographic and modifiable factors associated with knee contracture in children with cerebral palsy. Dev Med Child Neurol 2018; 60(4): 391-6. doi: 10.1111/dmcn.13659. CPUP. Follow-up surveillance programme for people with cerebral palsy Available from: www.cpup.se. Daltroy L H, Liang M H, Fossel A H, Goldberg M J. The POSNA pediatric musculoskeletal functional health questionnaire: report on reliability, validity, and sensitivity to change. Pediatric Outcomes Instrument Development Group. Pediatric Orthopaedic Society of North America. J Pediatr Orthop 1998; 18(5): 561-71. doi: 10.1097/00004694-199809000-00001. Galey S A, Lerner Z F, Bulea T C, Zimbler S, Damiano D L. Effectiveness of surgical and non-surgical management of crouch gait in cerebral palsy: a systematic review. Gait Posture 2017; 54: 93-105. doi: 10.1016/j. gaitpost.2017.02.024. Graham H K, Harvey A, Rodda J, Nattrass G R, Pirpiris M. The Functional Mobility Scale (FMS). J Pediatr Orthop 2004; 24(5): 514-20. Hägglund G, Wagner P. Development of spasticity with age in a total population of children with cerebral palsy. BMC Musculoskelet Disord 2008; 9: 150. doi: 10.1186/1471-2474-9-150.

478

Hägglund G, Andersson S, Duppe H, Lauge-Pedersen H, Nordmark E, Westbom L. Prevention of dislocation of the hip in children with cerebral palsy: the first ten years of a population-based prevention programme. J Bone Joint Surg Br 2005; 87(1): 95-101. Harvey A, Graham H K, Morris M E, Baker R, Wolfe R. The Functional Mobility Scale: ability to detect change following single event multilevel surgery. Dev Med Child Neurol 2007; 49(8): 603-7. doi: 10.1111/j.14698749.2007.00603.x. Harvey A R, Morris M E, Graham H K, Wolfe R, Baker R. Reliability of the functional mobility scale for children with cerebral palsy. Phys Occup Ther Pediatr 2010; 30(2): 139-49. doi: 10.3109/01942630903454930. Lennon N, Church C, Miller F. Patient-reported mobility function and engagement in young adults with cerebral palsy: a cross-sectional sample. J Child Orthop 2018; 12(2): 197-203. doi: 10.1302/1863-2548.12.170127. Ma F Y, Selber P, Nattrass G R, Harvey A R, Wolfe R, Graham H K. Lengthening and transfer of hamstrings for a flexion deformity of the knee in children with bilateral cerebral palsy: technique and preliminary results. J Bone Joint Surg Br 2006; 88(2): 248-54. doi: 10.1302/0301-620x.88b2.16797. Miller F. Cerebral palsy [Electronic resource]. New York: Springer Science+Business Media; 2005. Nordmark E, Hägglund G, Lauge-Pedersen H, Wagner P, Westbom L. Development of lower limb range of motion from early childhood to adolescence in cerebral palsy: a population-based study. BMC Med 2009; 7: 65. doi: 10.1186/1741-7015-7-65. Novacheck T F, Stout J L, Tervo R. Reliability and validity of the Gillette Functional Assessment Questionnaire as an outcome measure in children with walking disabilities. J Pediatr Orthop 2000; 20(1): 75. Palisano R, Rosenbaum P, Walter S, Russell D, Wood E, Galuppi B. Development and reliability of a system to classify gross motor function in children with cerebral palsy. Dev Med Child Neurol 1997; 39(4): 214-23. Palisano R J, Shimmell L J, Stewart D, Lawless J J, Rosenbaum P L, Russell D J. Mobility experiences of adolescents with cerebral palsy. Phys Occup Ther Pediatr 2009; 29(2): 133-53. doi: 10.1080/01942630902784746. Rodby-Bousquet E, Hägglund G. Sitting and standing performance in a total population of children with cerebral palsy: a cross-sectional study. BMC Musculoskelet Disord 2010; 11: 131. doi: 10.1186/1471-2474-11-131. Rodby-Bousquet E, Hägglund G. Better walking performance in older children with cerebral palsy. Clin Orthop Relat Res 2012; 470(5): 1286-93. doi: 10.1007/s11999-011-1860-8.

Acta Orthopaedica 2021; 92 (4): 472–478

Rodda J M, Graham H K, Carson L, Galea M P, Wolfe R. Sagittal gait patterns in spastic diplegia. J Bone Joint Surg Br 2004; 86(2): 251-8. Rodda J M, Graham H K, Nattrass G R, Galea M P, Baker R, Wolfe R. Correction of severe crouch gait in patients with spastic diplegia with use of multilevel orthopaedic surgery. J Bone Joint Surg Am 2006; 88(12): 2653-64. doi: 10.2106/jbjs.e.00993. Russell D. Gross Motor Function Measure manual (GMFM). Hamilton, Ontario: McMaster University; 1993. Schmidt S M, Hägglund G, Alriksson-Schmidt A I. Bone and joint complications and reduced mobility are associated with pain in children with cerebral palsy. Acta Paediatr 2020; 109(3): 541-9. doi: 10.1111/apa. 15006. Steele K M, Damiano D L, Eek M N, Unger M, Delp S L. Characteristics associated with improved knee extension after strength training for individuals with cerebral palsy and crouch gait. J Pediatr Rehabil Med 2012; 5(2): 99-106. doi: 10.3233/prm-2012-0201. Stout J L, Gage J R, Schwartz M H, Novacheck T F. Distal femoral extension osteotomy and patellar tendon advancement to treat persistent crouch gait in cerebral palsy. J Bone Joint Surg Am 2008; 90(11): 2470-84. doi: 10.2106/jbjs.G.00327. Stuberg W A, Fuchs R H, Miedaner J A. Reliability of goniometric measurements of children with cerebral palsy. Dev Med Child Neurol 1988; 30(5): 657-66. doi: 10.1111/j.1469-8749.1988.tb04805.x. Sutherland D H, Davids J R. Common gait abnormalities of the knee in cerebral palsy. Clin Orthop Relat Res 1993; (288): 139-47. Taylor D, Connor J, Church C, Lennon N, Henley J, Niiler T, Miller F. The effectiveness of posterior knee capsulotomies and knee extension osteotomies in crouched gait in children with cerebral palsy. J Pediatr Orthop B 2016; 25(6): 543-50. doi: 10.1097/bpb.0000000000000370. Wilson N C, Mackey A H, Stott N S. How does the functional mobility scale relate to capacity-based measures of walking ability in children and youth with cerebral palsy? Phys Occup Ther Pediatr 2014; 34(2): 185-96. doi: 10.3109/01942638.2013.791917. WHO. World Health Organization. International Classification of Functioning, Disability and Health (ICF). Available from: https://www.who.int/classifications/icf/en/. Young J L, Rodda J, Selber P, Rutz E, Graham H K. Management of the knee in spastic diplegia: what is the dose? Orthop Clin North Am 2010; 41(4): 561-77. doi: 10.1016/j.ocl.2010.06.006.

Acta Orthopaedica 2021; 92 (4): 479–484

479

Pain, osteolysis, and periosteal reaction are associated with the STRYDE limb lengthening nail: a nationwide cross-sectional study Jan Duedal RÖLFING 1,2, Søren KOLD 3, Tobias NYGAARD 4, Mindaugas MIKUZIS 3, Michael BRIX 5, Christian FAERGEMANN 5, Martin GOTTLIEBSEN 1, Michael DAVIDSEN 1, Juozas PETRUSKEVICIUS 1, and Ulrik Kähler OLESEN 3 1 Orthopaedic

Reconstruction and Children’s Orthopaedics, Aarhus University Hospital, Aarhus; 2 Department of Clinical Medicine, Aarhus University, Aarhus; 3 Department of Orthopaedics, Interdisciplinary Orthopaedics, Aalborg University Hospital, Aalborg; 4 Department of Orthopaedics, Limb Lengthening and Bone Reconstruction Unit, Rigshospitalet, Copenhagen; 5 Department of Orthopaedics, Odense University Hospital, Odense, Denmark Correspondence: jan.roelfing@clin.au.dk Submitted 2021-02-17. Accepted 2021-03-08

Background and purpose — Observing serious adverse events during treatment with the Precice Stryde bone lengthening nail (NuVasive, San Diego, CA, USA), we conducted a nationwide cross-sectional study to report the prevalence of adverse events from all 30 bone segments in 27 patients treated in Denmark. Patients and methods — Radiographs of all bone segments were evaluated regarding radiographic changes in February 2021. We determined the number of bone segments with late onset of pain and/or radiographically confirmed osteolysis, periosteal reaction, or cortical hypertrophy in the junctional area of the nail. Results — In 30 bone segments of 27 patients we observed radiographic changes in 21/30 segments of 20/27 patients, i.e., 19/30 osteolysis, 12/30 periosteal reaction (most often multi-layered), and 12/30 cortical hypertrophy in the area of the junction between the telescoping nail parts. Late onset of pain was a prominent feature in 8 patients. This is likely to be a prodrome to the bony changes. Discoloration (potential corrosion) at the nail interface was observed in multiple removed nails. 15/30 nails were still at risk of developing complications, i.e., were not yet removed. Interpretation — All Stryde nails should be monitored at regular intervals until removal. Onset of pain at late stages of limb lengthening, i.e., consolidation of the regenerate, should warrant immediate radiographic examination regarding osteolysis, periosteal reaction, and cortical hypertrophy, which may be associated with discoloration (potential corrosion) of the nail. We recommend removal of Stryde implants as early as possible after consolidation of the regenerate.

Bone reconstruction and lengthening surgery entails many risks and unplanned surgeries are common (Frost et al. 2021, Morrison et al. 2020, Sheridan et al. 2020). However, since many adverse events can be managed with or without surgical intervention without affecting the long-term outcome, Paley (1990) redefined complications by subdividing these adverse events into problems, obstacles, and complications. Similarly, other groups suggest grading the severity of adverse events (class I–II–IIIA–IIIB) and dividing these into device and nondevice-related complications (Black et al. 2015, Frost et al. 2021). The introduction of an all internal Stryde bone lengthening nail (NuVasive, Specialized Orthopedics, San Diego, CA) in May 2018 was a game changer for bone-lengthening surgery because it enabled the majority of patient to fully weight-bear. Furthermore, the first publications showed promising clinical results with only few device-related complications and good biocompatibility without signs of corrosion (Robbins and Paley 2020, Iliadis et al. 2021). However, on February 4, 2021 the Danish Medicines Agency released an urgent field safety notice from NuVasive regarding Stryde and all PRECICE system devices. This notice came to prominence based on the British Medicines & Healthcare products Regulatory Agency (MHRA) identifying safety concerns. In the MHRA reference 2020/012/009/226/001 issued January 20, 2021 one concern that was raised was the “unknown long-term biological safety profile. This includes reports of pain and bony abnormalities at the interface between the telescoping nail segments.” We evaluated the prevalence of radiographic changes in terms of osteolysis, periosteal reactions, and cortical hypertrophy at the junction of the telescoping nail segments as well as late onset of pain and/or swelling in the area.

480

Acta Orthopaedica 2021; 92 (4): 479–484

Table 1. Population characteristics: median (range)

Patients and methods

Aarhus Aalborg Odense

University Rigs- University University We performed a cross-sectional Total Hospital hospitalet Hospital Hospital analysis of all bone-lengthening patients operated on with the Bone segments, n 30 12 7 9 2 Femur 19 7 6 4 2 Stryde implant at Aarhus Univer Tibia 11 5 1 5 0 sity Hospital, Aalborg University Patients, n 27 9 7 9 2 Hospital, Odense University HosPatient age 20 (11–65) 18 (11–65) 20 (15–41) 19 (16–42) 17; 46 Planned lengthening, mm 35 (15–80) 40 (15–80) 33 (25–65) 35 (20–80) 20; 35 pital, and Rigshospitalet Denmark. Evaluated radiograph after 3 patients presented with severe lengthening procedure, months 11 (2–23) 11 (6–20) 11 (2–20) 8 (3–23) 12; 20 adverse events at 3 of the centers Removed nails, n 15 7 2 5 1 routinely 8 3 1 3 1 on December 15, 2020, January 29, a b c due to complication or pain 7 4 1 2 0 2021, and February 1, 2021 (SupNot yet removed nail, n 15 5 5d 4 1 plementary data). These events and a 2 nails—severe pain and osteolysis and periosteal reaction, 1 nail—severe pain only, the concomitant field safety notice 1 nail—delayed healing and broken nail. led to the present study. New b 1 nail—pain and osteolysis. c radiographs of all implanted nails 1 nail—severe pain and osteolysis and periosteal reaction, 1 nail—valgus deformity corrected with trauma nail. were obtained in order to assess d 1 nail—delayed healing and broken nail, not yet removed. the prevalence of bony changes on standard anteroposterior and lateral radiographs. The treating physicians reported the clinical findings: pain and/or swelling in the junctional area. The radiographic changes were classi- Results fied based on a consensus decision of 3 authors evaluating the 30 bone segments (19 femurs, 11 tibias) of 27 patients were latest radiographs obtained in January/February 2021 or the lengthened using the fully weight-bearing Stryde nail in Denlatest radiographs before hardware removal (Figure 1, Supple- mark from the release of the Stryde in May 2018 until Janumentary data). ary 2021. According to the Danish distributor, no other Stryde Patients were identified by searching the electronic patient nails were implanted at other hospitals or private practices in records for NOMESCO codes for bone-lengthening surgery Denmark. We are thus able to present data on all 30 Danish of the lower limb (KNFK69 or KNGK69). The number of patients. The evaluated radiographs of all 30 nails and clinical patients at each center was double-checked with the number photos of removed nails are available as Supplementary data. of billings from the Danish distributor to either hospital. There Table 1 gives the median values of number of bone segwere no exclusion criteria. ments, number of patients, the age of the patients at the lengthening surgery, the planned bone lengthening, the observation Primary outcome measures time, and number of nails that have not yet been removed. • Number of bone segments with confirmed osteolysis and/or periosteal reaction and/or cortical hypertrophy in the junc- Primary outcome measures tional area in at least one radiographic projection. Evaluating the latest radiographs of 30 nails, we found 19/30 • Late onset of symptoms, i.e., pain and/or swelling in the osteolysis, 12/30 periosteal reaction, 12/30 cortical hypertrojunctional area with or without radiographic changes. phy (Supplementary data). 9/30 lengthened bone segments had no radiographic junctional changes, while 1 radiographic Secondary outcome measures abnormality was present in 4/30 and 17/30 had 2 or all 3 radioSymptoms warranting unplanned radiographic examination graphic signs. Periosteal reactions were most often of multiand further clinical and paraclinical investigations; number layered onion-skin type, indicative of rapid evolvement, and of patients still at risk, e.g., implant not yet removed; visual may thus resemble tumor or infection (Figure 2). inspection and photo documentation of the removed implant. Late onset of pain in the area was prevalent in 6 of these patients with verified radiographic changes. Furthermore, 2 Ethics, funding, and conflicts of interest patients complained of late onset of pain without any radioThe study was conducted according to the Declaration of Hel- graphic signs present. sinki and was approved by the local institutional review board Figure 1 illustrates osteolysis and periosteal reaction after as part of standard clinical care and service evaluation. No consolidation of the regenerate as well as discoloration (potenexternal funding was obtained. The authors declare no con- tial corrosion) after removal of a nail. This observation was flicts of interest. present in many of the removed nails (Supplementary data).

Acta Orthopaedica 2021; 92 (4): 479–484

481

Figure 1. A removed nail. Osteolysis, periosteal reaction, and discoloration (potential corrosion) are evident 315 days after index surgery. Please refer to the Supplementary data for other examples.

Figure 3. Discoloration at the telescoping interface of a removed Stryde nail.

Figure 2. Osteolysis and periosteal reaction at postoperative day 229 (48 days after the onset of symptoms, i.e., pain in the area). FDG PET CT revealed increased glucose uptake in the area. The removed nail (day 239) had discoloration (potential corrosion) and on MRI on the subsequent day cortical destruction and periosteal reaction of the right tibia as well as soft tissue involvement were present. In comparison, a Stryde nail without discoloration was simultaneously removed from the asymptomatic left side.

Figure 2 highlights the time-dependency of the observed adverse event with neither sign of osteolysis nor periosteal reaction 11 weeks before the fulminant appearance. Figure 2 also demonstrates that the radiographic signs can mimic infection or tumor. Cortical destruction and marked periosteal reaction, as well as soft-tissue involvement and swelling, are evi-

dent on the postoperative MRI findings of the right compared with the asymptomatic left tibia of the same patient, who had both Stryde nails (right side with discoloration, left side without discoloration) removed simultaneously. Importantly, none of the obtained biopsies were culture-positive. However, a biopsy from the medullary cavity revealed an inflammatory

482

reaction (acute/chronic) with foreign-body giant cells and metallic material. Similar findings were made in patient no. 17 (Supplementary data). Unrelated to the scope of the study, but importantly, 2/30 femoral Stryde nails broke within 1 year after implantation (nail diameter: Ø11.5, Ø11.5; patient weight: 55 kg, 85 kg).

Discussion In this cross-sectional study, the prevalence of either osteolysis, periosteal reaction, or cortical hypertrophy was 21/30 in the area of the telescoping interface of the Stryde limb lengthening nail. If the suspected prodrome—late onset of pain in the junctional area—is included, 23/30 of segments were affected. Despite these alarming numbers, the true incidence rate is likely to be higher, as 15/30 of the bone segments were still at risk, i.e., the nails were not removed at the time of writing. The 1st case with minor osteolytic lesions occurred in May 2020 and the radiographic changes were retrospectively identified following the presentation of the more pronounced cases and the Field Safety Notice from NuVasive Specialized Orthopedics in Denmark from February 4, 2021 (Danish Medicines Agency 2021). The 3 cases with more pronounced bony reactions presented at 3 different centers within 7 weeks from the middle of December 2020. Our findings are in line with the MHRA statement: “reports of pain and bony abnormalities at the interface between the telescoping nail segments”. In contrast, Robbins and Paley (2020), who were the first to implant the device in May 2018, state in their paper from 2020 evaluating 187 lengthened bone segments in 106 patients: “There were no issues related to biological incompatibility of the Biodur® 108 alloy stainless steel from which the implant was fabricated. There was no corrosion seen in the few nails that were removed during this short study time.” Biodur® 108 Alloy (ASTM F2229) is an alloy almost free from nickel and carbon, which theoretically should be stronger and more corrosion resistant compared with stainless steel and other alloys, thus allowing for full weight-bearing during the lengthening process (Robbins and Paley 2020). According to ASTM F2229 the composition of the alloy (% mass/mass) is manganese 21–24, chromium 19–23, molybdenum 0.5–1.5, nitrogen 0.85–1.1, silicon 0.75 max, carbon 0.08 max, nickel 0.05 max, phosphorus 0.03 max, copper 0.25 max, sulfur 0.01 max, and iron. We observed discoloration (potential corrosion) at the junction of the telescoping parts of the symptomatic patients (Figure 3, Supplementary data). Despite these observations the reason for the reported symptoms and radiographic changes remains speculative. Importantly, no nails were recharged or backed, which potentially could have worn a seal, etc. (Panagiotopoulou et al. 2018, Schiedel 2020). The adverse events predominantly occurred during late stages of bone lengthening, i.e., after (or during) maturation

Acta Orthopaedica 2021; 92 (4): 479–484

of the regenerate and after the structural integrity of the bone has been restored (Hvid et al. 2016). Micromotion and wear debris at the nail junction should therefore have been minimal when the adverse events were noticed. Whether full weightbearing during earlier stages may have set processes in motion causing the later onset of symptoms, or whether the alloy is not protecting against, but rather causing corrosion has yet to be determined. However, other explanations may also apply. The affected nails were or will be further analyzed, which may offer an explanation at a later stage. The marked response of the bone tissue with intramedullary osteolysis and periosteal reaction was in 6 cases preceded by pain, and 2 patient were suddenly unable to fully weightbear after solid consolidation of the regenerate. Whether the observed changes in the majority of cases are confined to the bony tissue or if the surrounding soft tissue also is affected as demonstrated in Figure 2 is unclear. Soft tissue biopsies and/ or MRI scans after removal of the nail may shed light on this. The time from onset of symptoms to radiographic changes ranged from 15 to 48 days. In particular, the manifestation of fulminant radiologic changes within 11 weeks between normal appearing bone to pronounced intra- and extramedullary bony changes is concerning (Figure 2). Pain and later swelling had been present for approximately 7 weeks prior to the radiographic changes. However, despite these symptoms the patient did not contact the treating physician before the scheduled clinical and radiographic control in the outpatient clinic. Notably, radiographic changes were also seen in patients with no or subclinical symptoms. Vogt et al. (2021) also describe osteolysis in Stryde nails. Late onset of pain was also the dominant clinical feature in their case series, which was relieved by hardware removal. Moreover, they observed discoloration of the nail at the telescoping interface. All our patients were within the weight-bearing limit of the applied nail (Robbins and Paley 2020). Nonetheless, 2 femoral Stryde nails broke within 1 year after implantation and causing reoperations in this series of 30 nails. Both patients were well within the weight limit of the Ø11.5 mm femoral nail, i.e., 55 kg and 85 kg. This adverse event is not considered to be related to the described osteolysis, but a consequence of regenerate insufficiency. Unrelated to these adverse events, the MHRA also raised concerns about a potentially “inappropriate use in children and adolescents” (British Orthopaedic Association 2021). The MHRA acknowledges a widespread use of PRECICE devices in this age group, but highlights that the nails have not been validated for use in these patient groups (Frommer et al. 2018, Nasto et al. 2020, Iobst 2020, Iliadis et al. 2021, Vogt et al. 2021). In our study, 10/27 patients were younger than 18 years old with the youngest patient being 11 years of age. Unlike the Stryde, other bone-lengthening nails such as the titanium PRECICE lengthening nail (introduced in May 2013 and improved to its latest version P2.2. in 2015) and

Acta Orthopaedica 2021; 92 (4): 479–484

the stainless-steel lengthening nail, Fitbone (Orthofix, Lewisville, TX, USA) have been on the market for many years. The growing body of literature regarding complications of all internal limb lengthening does not state any similar adverse events (Calder et al. 2019, Frost et al. 2021, Morrison et al. 2020, Thaller et al. 2020, Iliadias et al. 2021). However, the current MHRA recommendations not to implant the PRECICE P2.2 as well as scientific scrutiny demand retrospective studies evaluating whether this phenomenon/ adverse event also exists on a lower scale with these products. Moreover, the PRECICE Bone Transport nail (Kähler Olesen and Herzenberg 2020, Ferner et al. 2020, Abood et al. 2021) and the bone transport plate are made of the same Biodur 108 alloy, which could be one possible reason for the described phenomenon. Thus, these implants are more likely to be affected than the titanium PRECICE P2.2 (Calder et al. 2019, Kähler Olesen et al. 2019, Nasto et al. 2020) or stainless-steel Fitbone (Krieg et al. 2011, Accadbled et al. 2016, Horn et al. 2019). Our surveillance and pre-emptive treatment strategy of the Stryde implant 1. We recommend monitoring patients with Stryde implants closely (4–6 weekly) even, or especially, during late stages of the bone-lengthening process. 2. Patients should be informed about the adverse events and that we consider pain and potential swelling at the junctional area to be prodromes. 3. Hardware removal after the regenerate is sufficiently consolidated should not be delayed, even in asymptomatic patients without radiological changes. Exchange nailing before consolidation may be considered, if exchange nailing was part of the initial treatment plan or in cases with delayed healing or in symptomatic patients. 4. Intramedullary or transcortical bone and soft tissue biopsies may help to identify the cause and extent of this complication. 5. PET-CT may be useful in diagnosis and MRI after hardware removal may also be used in addition to biopsies to rule out resembling diagnosis, i.e., infection and cancer (Figure 2). 6. Blood/urine samples for metals were collected only sporadically in our patients. Similar to the lessons learned from metal-on-metal prostheses, a more structured surveillance program and a consensus on this program should be obtained. However, bone-lengthening nails are extraarticular devices and unlike metal-on-metal prostheses should be removed after bony consolidation. Thus, the risk of the described phenomenon and its potential long-term consequences, e.g., locally within the bone and soft tissue (potentially including pathological fractures) and systemically (metal allergy, deposition of metals in internal organs) may be alleviated by nail removal (Jakobsen et al. 2007, Langton et al. 2010, Hjorth et al. 2016).

483

In conclusion, osteolysis, periosteal reactions and/or cortical hypertrophy at the telescoping interface were present in 21/30 of the implanted Stryde nails. If the suspected prodrome—late onset of pain and/or swelling in the area—is included, 23/30 segments were affected. Despite these alarming numbers, the true incidence rate is likely to be higher, as 15/30 of the bone segments were still at risk, i.e., the nail was not yet removed. The company has recalled all Stryde implants; however, structured clinical surveillance programs of patients with implanted Stryde nails are warranted. Supplementary data Supplementary data is available in the online version of this article. The pdf file includes all 30 evaluated AP and lateral radiographs from January/February 2021 or the latest before nail removal. The consensus decision of 3 authors regarding the presence (+/-) of osteolysis, periosteal reaction, or cortical hypertrophy is given for each evaluated radiograph. Additionally, pathological evaluations of two biopsies are available, see online version of this article, http://dx.doi.org/10.1080/17453 674.2021.1903278

JDR+SK+TN+UKO: study design, data collection, critical review. and final approval of the manuscript. JDR+MM+UKO+TN: data curation and analysis. JDR: first draft of the manuscript. All authors: data collection and final approval of the manuscript. Besides the corresponding author, Søren Kold (sovk@rn.dk) and Ulrik Kähler Olesen (ulrik.kaehler@gmail.com) can also be contacted regarding this study. Thanks are offered to clinical photographer Tina Rasmussen, Department of Plastic Surgery and Burns Treatment, Rigshospitalet for detailed photo of nail discoloration. Pathological examination, illustration, and staining of biopsy (Supplementary data) was undertaken by Søren Daugaard, Rigshospitalet and Steen Bærentzen, Department of Pathology, Aarhus University Hospital, Denmark.

Abood A A-H, Petruskevicius J, Vogt B, Frommer A, Rödl R, Rölfing J D. The Joint Angle Tool for intraoperative assessment of coronal alignment of the lower limb. Strategies Tauma Limb Reconstr 2021; epub ahead of print. doi: 10.5005/jp-journals-10080-1511 Accadbled F, Pailhé R, Cavaignac E, Sales de Gauzy J. Bone lengthening using the Fitbone® motorized intramedullary nail: the first experience in France. Orthop Traumatol Surg Res 2016; 102: 217-22. doi: 10.1016/j. otsr.2015.10.011 Black S R, Kwon M S, Cherkashin A M, Samchukov M L, Birch J G, Jo C H. Lengthening in congenital femoral deficiency: a comparison of circular external fixation and a motorized intramedullary nail. J Bone Joint Surg Am 2015; 97: 1432-40. doi: 10.2106/JBJS.N.00932 British Orthopaedic Association. NuVasive PRECICE device recall by MHRA; 2021. Available at: https://www.boa.ac.uk/resources/nuvasiveprecice-device-recall-by-mhra.html (accessed February 14, 2021). Calder P R, McKay J E, Timms A J, Roskrow T, Fugazzotto S, Edel P, Goodier W D. Femoral lengthening using the Precice intramedullary limblengthening system. Bone Joint J 2019; 1001: 1168-76. 10.1302/0301620X.101B9.BJJ-2018-1271.R1

484

Danish Medicines Agency. NuVasive Specialized Orthopedics, Inc. tilbagetrækker Precice System-serien; 2021. Available at: https://laegemiddelstyrelsen.dk/da/udstyr/sikkerhedsmeddelelser/2021/02/-nuvasivespecialized-orthopedics,-inc-tilbagetraekker-precice-system-serien/ (accessed February 14, 2021). Ferner F, Lutter C, Dickschas J. Retrograde bone transport nail in a posttraumatic femoral bone defect. Unfallchirurg 2020; epub ahead of print. doi: 10.1007/s00113-020-00916-1 Frost M W, Rahbek O, Traerup J, Ceccotti A A, Kold S. Systematic review of complications with externally controlled motorized intramedullary bone lengthening nails (FITBONE and PRECICE) in 983 segments. Acta Orthop 2021; 92(1): 120-7. doi: 10.1080/17453674.2020.1835321 Frommer A, Rödl R, Gosheger G, Vogt B. Application of motorized intramedullary lengthening nails in skeletally immature patients: indications and limitations. Unfallchirurg 2018; 121: 860-7. doi: 10.1007/s00113018-0541-4 Horn J, Hvid I, Huhnstock S, Breen A B, Steen H. Limb lengthening and deformity correction with externally controlled motorized intramedullary nails: evaluation of 50 consecutive lengthenings. Acta Orthop 2019; 90: 81-7. doi: 10.1080/17453674.2018.1534321 Hjorth M H, Stilling M, Soballe K, Bolvig L H, Thyssen J P, Mechlenburg I, Jakbosen S S. No association between pseudotumors, high serum metalion levels and metal hypersensitivy in large-head metal-on-metal total hip arthroplasty at 5–7 year follow-up. Skeletal Radiol 2016; 45: 115-25. doi: 10.1007/s00256-015-2264-8 Hvid I, Horn J, Huhnstock S, Steen H. The biology of bone lengthening. J Child Orthop 2016; 10: 487-92. doi: 10.1007/s11832-016-0780-2 Iliadis A D, Palloni V, Wright j, Goodier D, Calder P. Pediatric lower limb lengthening using the PRECICE nail: our experience with 50 cases. J Pediatr Orthop 2021; 41: e44-e49. doi: 10.1097/BPO.0000000000001672 Iobst C A. Long bone lengthening in children. Tech Orthop 2020; 35: 189-94. doi: 10.1097/BTO.0000000000000463 Jakobsen S S, Danscher G, Stoltenberg M, Larsen A, Bruun J M, Mygind T, Kemp K, Soballe K. Cobalt-chromium-molybdenum alloy causes metal accumulation and metallothionein up-regulation in rat liver and kidney. Basic Clin Pharmacol Toxicol 2007; 101(6): 441-6. doi: 10.1111/j.17427843.2007.00137.x Krieg A H, Lenze U, Speth B M, Hasler C C. Intramedullary leg lengthening with a motorized nail. Acta Orthop 2011; 82: 344-50. doi: 10.3109/ 17453674.2011.584209 Kähler Olesen U, Herzenberg J E. Bone transport with internal devices. Techniques in Orthopaedics 2020; 35: 219-24. doi: 10.1097/BTO. 0000000000000474

Acta Orthopaedica 2021; 92 (4): 479–484

Kähler Olesen U, Nygaard T, Prince D E, Gardner M P, Singh U M, McNally M A, Green C J, Herzenberg J E. Plate-assisted bone segment transport with motorized lengthening nails and locking plates: a technique to treat femoral and tibial bone defects. J Am Acad Orthop Surg Glob Res Rev 2019; 3(8): e064. doi: 10.5435/JAAOSGlobal-D-19-00064 Langton D J, Jameson S S, Nargol A V F, Natu S, Joyce T J, Hallab N J. Early failure of metal-on-metal bearings in hip resurfacing and largediameter total hip replacement. Bone Joint Surg Br 2010; 92-B: 38-46. doi: 10.1302/0301-620X.92B1.22770 Morrison S G, Georgiadis A G, Huser A J, Dahl M T. Complications of limb lengthening with motorized intramedullary nails. J Am Acad Orthop Surg 2020; 28: e803-e809. doi: 10.5435/JAAOS-D-20-00064 Nasto L A, Coppa V, Riganti S, Ruzzini L, Manfrini M, Campanacci L, Palmacci O, Boero S. Clinical results and complication rates of lower limb lengthening in paediatric patients using the PRECICE 2 intramedullary magnetic nail: a multicenter study. J Pediatr Orthop B 2020; 29: 611-17. doi: 10.1097/BPB.0000000000000651 Paley D. Problems, obstacles, and complications of limb lengthening by the Ilizarov technique. Clin Orthop Relat Res 1990; (250): 81-104. doi: 10.1097/00003086-199001000-00011 Panagiotopoulou V C, Davda K, Hothi H S, Henckel J, Cerquiglini A, Goodier W D, Skinner J, Hart A, Calder P R. A retrieval analysis of the Precice intramedullary limb lengthening system. Bone Joint Res 2018; 7: 476-84. doi: 10.1302/2046-3758.77.BJR-2017-0359.R1 Robbins C, Paley D. Stryde weight-bearing internal lengthening nail. Techniques in Orthopaedics 2020; 35: 201-8. doi: 10.1097/BTO. 0000000000000475 Schiedel F. Extracorporal noninvasive acute retraction of STRYDE® for continued lengthening in cases with limited nail stroke: a technical less invasive solution to reload the STRYDE®. Arch Orthop Trauma Surg 2020. Epub ahead of print. doi: 10.1007/s00402-020-03484-6 Sheridan G A, Falk D P, Fragomen A, Rozbruch S R. Motorized internal limb-lengthening (MILL) techniques are superior to alternative limb-lengthening techniques: a systematic review and meta-analysis of the literature. JBJS Open Access 2020; 5: e20.00115. doi: 10.2106/JBJS.OA. 20.00115 Thaller P H, Frankenberg F, Degen N, Soo C, Wolf F, Euler E, Fürmetz J. Complications and effectiveness of intramedullary limb lengthening: a matched pair analysis of two different lengthening nails. Strategies Trauma Limb Recon 2020; 15: 7-12. doi: 10.5005/jp-journals-10080-1448 Vogt B, Rödl R, Gosheger G, Schulze M, Hasselmann J, Fuest C, Toporowski G, Laufer A, Frommer A. Focal osteolysis and corrosion at the junction of the Precice Stryde® intramedullary lengthening device – Preliminary clinical, radiographic and metallurgic analysis of 57 lengthened segments. (currently under review / personal correspondance).

Acta Orthopaedica 2021; 92 (4): 485–492

485

Complications common in motorized intramedullary bone transport for non-infected segmental defects: a retrospective review of 15 patients Mindaugas MIKUŽIS 1,2, Ole RAHBEK 1,2, Knud CHRISTENSEN 1,2, and Søren KOLD 1,2 1 Department of Orthopaedics, Aalborg University Hospital, Aalborg; 2 Interdisciplinary Orthopaedics, Aalborg University Hospital, Aalborg, Denmark Correspondence: mim@rn.dk Submitted 2020-12-23. Accepted 2021-03-09.

Background and purpose — Since the introduction of intramedullary bone transport nails only very few cases have been reported in the literature. Thus we evaluated the results and complications in a single institution retrospective cohort. Patients and methods — 15 (median age 40 years (1870), 8 males) consecutive patients, were included and the electronic patient records and radiographs were reviewed. Complications were severity graded and categorized as device or non-device related. Results — The segmental bone loss was due to non-union site in 8 femurs and 4 tibias, or traumatic bone loss in 2 femurs and 1 tibia. The segmental bone defect was a median of 3 cm (0.5–10). 9 of 10 femoral cases and 4 of 5 tibial cases healed with the bone transport nail. All 15 patients had a healed docking site and regenerate at the end of treatment after a median of 13 months (6–38). 24 complications (15 device related and 9 non-device related) occurred in 11/15 patients with a minimum follow-up of 6 months after nail removal. The number of unplanned surgeries due to device related complications was: 0 in 9 patients, 1 in 3 patients, 2 in 1 patient, 3 in 2 patients. Interpretation — Segmental bone defects can heal with a bone transport nail. However, the number of complications was high and 15 out of 24 complications were devicerelated. Optimizing nail design is therefore needed to reduce complications in intramedullary bone transport.

The concept of intramedullary bone transport nails to treat lower limb segmental bone defects was introduced by Baumgart et al. (1997) and refined by Kold and Christensen (2014) to alleviate the known complications seen in bone transport by external fixation (Paley and Maar 2000). The assumed advantages of using a fully implantable bone transport nail compared with external fixation is that early full joint motion is facilitated as skin and muscles are not transfixated, patient discomfort is reduced, pin site infections are eradicated, and the nail can be left in situ until the callus is sufficiently hardened. This potentially reduces the risk of fracture and secondary deformity as seen after removal of external fixators (Liu et al. 2020). However, only 5 cases of intramedullary bone transport nails have been reported (Baumgart et al. 1997, Kold and Christensen 2014, Accadbled et al. 2019), and a recent systematic review has showed high complication rates in bone lengthening despite the use of externally controlled motorized bone lengthening nails (Frost et al. 2020). Therefore, there is a need to investigate assumed advantages of the internal bone transport technique and observe if other complications are introduced by this new technique. We report our experience with the FITBONE bone transport nail in 15 patients with a minimum of 6 months’ follow-up after nail removal. We posed 2 questions: Are the bone transport nails capable of obtaining bone healing? Have new complications been introduced by the motorized transport nail?

Patients and methods Design and participants This is a single institution (Aalborg University Hospital, Denmark) retrospective case series with 15 patients (10 femur and 5 tibia) treated with the intramedullary bone transport FITBONE nail between 2012 and 2016. All bone transport © 2021 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group, on behalf of the Nordic Orthopedic Federation. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. DOI 10.1080/17453674.2021.1910777

486

Acta Orthopaedica 2021; 92 (4): 485–492

Figure 1. 36-year-old male (patient 3), treated for atrophic non-union in the femur. After resection of non-union and proximal osteotomy, the transport nail is inserted (A), and the distraction and bone transport is started for the bone transport phase (A, B, C). When bone transport is completed and bone ends at the resection site are docked (D), compression at the docking site and additional bone lengthening is started for the bone lengthening phase (D, E, F). The protrusion of the distal tip of the nail shows the amount of additional lengthening of the femur (F). At the end of the consolidation phase (G, H), the regenerate and docking site are healed (I).

nails were removed after the regenerate and docking site had fully consolidated in 3 out of 4 cortices. Follow-up after nail removal was median 46 months (6–89). Complications were extracted from patient records and scored according to Black et al. (2015) as categorized in Table 2. Complications were furthermore rated as device related or non-device related where device-related complications arise from properties of the implantable device itself (Lee et al. 2017). The latest radiographs after nail removal were used for measurement of alignment (mechanical axis deviation, MAD) and limb length discrepancy (LLD). The long standing radiographs were obtained in 13 out of 15 patients. In the 2 other patients the LLD was evaluated clinically, and the alignment evaluated on regular radiographs. The indication for bone transport with FITBONE nail was segmental bone loss, where it was judged safe to insert an intramedullary nail. Thus, the patients included in this study did not have soft tissue defects or preoperative clinical signs of infection. Bone biopsies were taken from the resection site for bacterial cultures during the nail insertion surgery. The segmental bone loss was due to resection of non-union site in 8 out of 10 femoral cases and 4 out of 5 tibial cases, or traumatic segmental bone loss in 2 femoral cases and 1 tibial case. In the investigated time period, the femoral bone transport was only performed at our institution by the reported intramedullary FITBONE nail. In contrast, the tibial cases represent selected cases as the majority of tibial bone trans-

ports were made with external frames. In this patient case series, joint fusions were not performed. At least 2 surgeries had been performed prior to the bone transport in 12 out of 15 patients. A post-study description of the non-unions made from the Calori non-union score (Calori et al. 2008) showed a median of 35 (8–40). 13 out of 15 patients had a Calori nonunion score above 25 indicating the need for bone transport (Abumunaser and Al-Sayyad 2011). 2 patients had a Calori score below 25 (patient no. 14: score of 8 and patient no. 15: score of 20). Treatment and surgery The intramedullary bone transport FITBONE nail (femur FSA, tibia TSA) was produced by Wittenstein intens GmbH (Igersheim, Germany). The FITBONE bone transport nail builds on the technology of the FDA-approved FITBONE lengthening nail. The FITBONE bone transport nail is CEmarked for the European market, but it is currently not FDA approved. The nail consists of a motorized lengthening device which has an up to 8 cm (depending on nail length) sliding slot in the middle part of the nail with a hole for locking screw (Figure 1A). The transporting bone segment is locked in the sliding hole between the osteotomy and resection site. Additional lengthening after completion of bone transport is obtained either by sliding locking screws (Figure 2) or protrusion of distal part of the nail as the total nail and bone segment length increase (Figure 1D–1F, Figure 3C). In tibial cases, the

Acta Orthopaedica 2021; 92 (4): 485–492

487

Figure 2. 62-year-old woman (patient 5) with distal femur non-union, unsuccessfully treated with locking plate and IM nail prior to the bone transport surgery, with 6 cm LLD and severe varus deformity (A). The non-union was resected and deformity corrected at the same time. The distal segment length was 4 cm (B). Bone transport is almost completed and the gap is bone grafted (C). After docking is achieved, the distraction is continued and sliding mechanism of the proximal part of the nail provide lengthening of the femur (D). at 12 months’ follow-up, the docking site and regenerate are healed (E). Latest follow-up shows corrected mechanical axis and 1 cm LLD (F).

Figure 3. 29-year-old female (patient 13) with 1 cm LLD and shortened fibula is treated with resection of nonunion (A). Proximal osteotomy for segmental bone transport was performed and 3 cm bone gap in the midshaft is closed, and the bone transport phase is completed (B). Prior to the lengthening of the tibia and fibula, a fibula osteotomy is performed (B), and proximal and distal tibio-fibular screws are inserted to protect the tibio-fibular joints. 1 cm lengthening of the tibia and the fibula is achieved (C). Follow-up after nail removal shows healed docking site and regenerate (D), LLD is corrected.

tibio-fibular joints were transfixated and a fibular osteotomy was perfomed prior to the lengthening phase of the tibia and fibula. The 8 cm maximum stroke of the nail can be distributed between the transport of the segment and bone segment lengthening. Surgery consisted of 3 steps: (1) resection of non-union or non-vital bone ends; (2) ventilating drill holes, bone canal reaming, and a percutaneous osteotomy for distraction osteogenesis; (3) insertion of nail and locking in both ends, and insertion of the sliding screw in the bone transport segment (Figure 1). Bone malalignment in any plane was corrected acutely at the time of nail insertion. The reverse planning method (Baumgart 2009) was used to maintain or to correct bone malalignment in the frontal plane (Figure 2). On the femur, the use of an antegrade or a retrograde approach depends on the location of the defect, presence of

488

Acta Orthopaedica 2021; 92 (4): 485–492

Table 1. Summary of the 15 patients treated with FITBONE bone transport nails Case no. Age Original pathology Nail approach Type of graft Femur 1 2 3 4 5 6 7 8 9 10 Tibia 11 12 13 14 15

Union at docking site

Healing of bone regenerate

Follow-up in months after nail removal

61 56 36 23 62 70 22 37 18 62

Non-union, oligotrophic Non-union, oligotrophic Non-union, oligotrophic Non-union, atrophic Non-union, atrophic Non-union, atrophic Segmental bone loss, open fracture Non-union, atrophic Segmental bone loss, open fracture Non-union, hypertrophic

Retrograde Retrograde Antegrade Retrograde Antegrade Antegrade Antegrade Retrograde Antegrade Antegrade

Autograft + Osigraft Autograft + Osigraft Autograft + Osigraft Autograft Autograft + Osigraft Autograft + Osigraft No graft Autograft + Osigraft No graft Autograft + Osigraft

No Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes

70 47 59 61 42 35 28 26 46 29

53 40 29 38 40

Non-union, oligotrophic Segmental bone loss, open fracture Non-union, atrophic Non-union, hypertrophic Non-union, hypertrophic

Antegrade Antegrade Antegrade Antegrade Antegrade

Osigraft Autograft Osigraft Autograft + Osigraft Autograft

Yes No Yes Yes Yes

89 58 46 6 27

deformity and the nail design. The segmental bone defect was median 3 cm (0.5–10). 13 cases were grafted with either autogenous bone graft, Osigraft (BMP-7) or both (Table 1). In the cases where grafting was postponed until docking of the transported segment, a percutaneous docking procedure was performed with removal of fibrous tissue prior to grafting. Aftercare The physiotherapy was started at day 1 after surgery with up to 20 kg weight-bearing during the distraction phase, and thereafter full weight-bearing was allowed. Bone distraction started after median 10 days (5–12) following surgery. The distraction speed was initially 0.33 mm 3 times per day, which was adjusted during the lengthening period depending on the quality of the bone regenerate. During the distraction phase the bone regenerate was radiographically followed every 1 or 2 weeks. After the end of distraction, bone healing of the regenerate and docking site was monitored monthly until full A B C consolidation. Ethics, funding, and potential conflicts of interest The study was approved by the institutional review board, registration ID number 2020-157. Each author certifies that he or she has no commercial associations or received funding that might pose a conflict of interest in connection with the submitted article.

Results Demographics 8 males and 7 females were included with the median age being 40 years (18–70). 4 patients had comorbidities such as diabetes, severe obesity, lupus, and rheumatoid arthritis.

Yes Yes Yes Yes Yes

Both smokers (5 patients) and non-smokers (10 patients) were included. Smokers were recommended to quit smoking prior to surgery, but no control was performed. Preoperative LLD was median 2 cm (0–6). Preoperative MAD was from 88 mm varus to 7 mm valgus (median 19 varus). In the femur group, 1 patient (patient 5) had a preoperative knee range of motion (ROM) from 0° to 40° and was subsequently treated with Judet’s quadricepsplasty (Ali et al. 2003) at the end of bone transport and lengthening. Bone healing (Table 1) 9 out of 10 femoral cases healed with the bone transport nail. The femoral failure occurred in a 61-year-old woman with impaired bone quality due to gastric bypass (patient 1, Table 1). The transport screw was inserted too close to the resection site, leaving only 7 mm of pulling bone stock proximal to the screw. As compression was applied over the docking site, the screw cut through the 7 mm bone stock resulting in loss of the achieved bone transport and thereby loss of bone contact at the docking site (Figure 4). The nail was changed to a regular trauma nail, and the femur healed with 3 cm shortening. LLD was later corrected by a standard FITBONE lengthening nail. 4 out of 5 tibial cases healed with the bone transport nail. The tibial case that did not heal was initially treated for acute bone loss after an open Gustilo IIIA fracture and developed signs of infection at the bone defect site during bone transport (patient 12, Table 1). Therefore, the nail was converted to an external circular frame after completion of the bone transport, and uneventful healing occurred hereafter. At the latest follow-up with median of 46 months (6–89) after nail removal all 15 patients had healed bone docking site and regenerate. 2 femoral cases did not receive any docking procedure as radiographic signs of good callus formation were present at

Acta Orthopaedica 2021; 92 (4): 485–492

489

Table 2. Complications graded by severity (I to IIIB) according to Black et al. 2015 and by origin (device and non-device related) according to Lee et al. 2017 Case no.

Device related complications, categories a II IIIA IIIB

Non-device related complications, categories a Unplanned II IIIA IIIB surgeries, n

Femur 1 2 receiver change 1 loss of transport, 4 due to infection, changed to trauma 1 screw discomfort nail 2 1 receiver removal 1 due to discomfort 3 0 4 0 5 1 nail re- 2 receiver change 1 fracture 3 bounding due to infection after removal 1 screw removal due to discomfort 6 1 screw backing out (changed during 1 docking surgery) 7 0 8 1 screw backing out 1 9 1 stiff knee 0 10 1 receiver change 1 nail stopped 1 additional 3 due to infection to lengthen grafting Tibia 11 1 screw backing out 1 syndesmotic (changed during screw removal docking surgery) due to discomfort 1 12 1 screw backing out 1 infection 1 changed 3 debridement to external fixation 13 0 14 1 nail re- 1 neurogenic 1 bounding pain, tarsal tunnel release 15 1 insertion of 1 forgotten syndes motic screw Complications in total (n = 24) 2, distraction 8, other 1, distraction 0 0 4 2 3 4, stability Unplanned surgeries in total 19 a Category

I complications required minimal intervention, and treatment goal was still achieved. Category II needed substantial change in treatment plan, such as unplanned return to operating room; the treatment goal was still achieved. Category IIIA complications failed to achieve treatment goal, but without developing new pathology or permanent sequelae. Category IIIB complication failed to achieve treatment goal and/ or new pathology or permanent sequelae developed.

the time of docking (Figure 5E). In the remaining cases, the docking site was grafted (Table 1). Complications (Table 2) 24 complications occurred in 11 out of 15 patients with a minimum follow-up of 6 months after nail removal. 2 complications led to minimal change in treatment plan (category I) in 2 patients. 16 complications led to substantial change in treatment plan (category II) in 9 patients. 3 complications resulted in failure to achieve treatment plan (category IIIA) in 3 patients. 1 complication (fracture after nail removal) resulted in new pathology (category IIIB) in 1 patient. 2 complications (reduced knee flexion and neurogenic foot pain) resulted in permanent sequelae at the end of treatment (cat-

egory IIIB) in 2 patients. 19 unplanned surgeries (11 device related and 8 non-device related) were needed in 10 out of 15 patients. Infections of the receiver occurred 5 times in 3 patients, rated as category II device-related complications. Infected receivers were changed and antibiotics administered based on biopsy for cultures. In 1 patient synovectomy was performed as the infection had spread via the connection cable from the receiver to a retrograde-inserted femoral nail into the knee joint. Other relevant findings The total distraction of the nail was median 4 cm (2–8), including the bone transport and additional lengthening (Table 3). Final LLD was 0 mm in 4 patients, < 10 mm in 10 patients and

490

Acta Orthopaedica 2021; 92 (4): 485–492

3 cm in 1 patient after bone transport nail removal. The patient with 3 cm LLD was a femoral case (patient 1, Table 3) with transport screw cut-out and after healing with a regular trauma nail. The LLD is reported prior to additional lengthening with a standard FITBONE lengthening nail. The bony deformity at the end of treatment was within 5 degrees in any plane (coronal, sagittal, and axial) in 14 out of 15 patients. 1 patient (patient 5, Table 3) had a sagittal plane deformity of 10°.

Discussion

Figure 4. Failure of femoral docking in a 61-year-old woman (patient 1). The position of the locking screw in the transport segment is 7 mm distally to the osteotomy site (A). The bone transport is almost completed, the compression of docking site and additional lengthening is started (B). The compression at the docking site failed (C) as the transport screw cut out and the transport segment lost the distraction.

Background and rationale To our knowledge, this study represents the largest case-series (Accadbled et al. 2019, Baumgart et al. 1997, Kold and Christensen 2014) of patients treated with a bone transport nail. The 15 patients had segmental bone defects due to either acute traumatic bone loss or bone resection of non-united fractures. Bone healing was achieved with the FITBONE bone transport nail in 9 out of 10 femoral cases and in 4 out of 5 tibial cases. In comparison, Accadbled et al. (2019) reported successful healing with the FITBONE bone transport nail in 3 out of 3 femoral segmental bone defects after tumor resection. No

Figure 5. 22-year-old patient (patient 7) with open fracture and segmental bone defect, primarily treated with external fixation (A). After resection of non-vital bone, the femur with 10 cm bone defect was stabilized with an IM nail (B). The trauma nail is changed to a bone transport nail and distraction is started (C). Bone transport at 2 cm (D), and completed at 8 cm due to early callus formation at the docking site (E). Follow-up after nail removal (F), the remaining varus of proximal tibia is not changed.

Acta Orthopaedica 2021; 92 (4): 485–492

491

Table 3. Data summary of 15 bone transport patients Case no.

Preoperative Preoperative Resection Bone Additional MAD, mm LLD, cm size, cm transport lengthening, cm

Femur 1 –7 2 21 3 0 4 12 5 88 6 33 7 17 8 27 9 0 10 30 Tibia 11 23 12 MPTA 89° 13 10 14 43 15 –4 Median (range)

1 2.5 3 0.5 7 3 0.5 0.5 4 2.5

3.5 2.5 2.5 3 2 2 10 4.5 4 4

2 0 1 1 1.5 2 (0–7)

3 8 3 2 0.5 3 (0–10)

2.5 2.5 2.5 3 2.5 2.5 8 5 1.5 4

–2 1.5 2.5 0 3.5 1 0 0.5 4 0.5

3 8 3 2 0.5 3.5 (1.5–8)

2 0 1 1 1.5 1 (–2 to 3.5)

Final LLD, cm

Postoperative MAD, mm

3 0 1 16 0.5 0 0.5 0 1 13 1.5 24 1.5 29 (deformity at tibia) 1 27 0 0 2 –5 0 9 2 MPTA 87° 0 10 0 MPTA 81° 0 7 1 (0–3)

Knee ROM 0–100 0–135 0–140 0–140 0–80 0–130 0–120 0–120 0–85 0–110 0–140 0–140 0–140 0–140 0–140

LLD = limb length discrepancy MAD = mechanical axis deviation MPTA = medial angle between the tibial mechanical axis and the proximal articular surface of the tibia in the coronal plane

tibial case series have been presented for bone transport nails, but in a recent systematic review of Ilizarov bone transport for treatment of tibial defects, the mean bone union rate was 90% (77–100) (Aktuglu et al. 2019). In a retrospective analysis of complications in 282 consecutive cases treated with Ilizarov external bone transport in the lower extremity, pin tract infections occurred in 66% of patients (Liu et al. 2020). In the majority, the pin tract infections were managed by daily pin site care and oral antibiotics; however, 20% of patients suffered deep pin tract infection or pin loosening and underwent treatment by pin replacement and intravenous antibiotics. Such pin-tract infections are avoided in our study by the use of a fully implantable bone transport nail. However, the complication rates were still high with this new treament, when patients were followed up to a minimum of 6 months after nail removal. 24 complications were observed in 11 out of 15 patients and 19 unplanned surgeries were performed in 10 out of 15 patients. Only 4 out of 15 patients did not sustain any complications, and 10 out of 15 patients had to undergo unplanned surgeries for complications. In 2 out of 15 patients the complication resulted in permanent sequelae. Lee et al. (2017) argued that, to fully understand the pros and cons of new bone-lengthening devices, analyses should divide complications related to the device itself from those that are not associated with the device. 15/24 complications were device related. Because of the novelty of the FITBONE transport nail this is expected and we believe that further development of bone transport nails could reduce these complications. As an example, 5 out of all 24 complications were related to infection of the subcutaneous receiver. This rate of infection seems high in light of our experience with FITBONE length-

ening nails and a recent systematic review of complications using lengthening nails (Frost et al. 2020). The high number of infections associated with the receiver in our study might be a result of recurrence of infection at the receiver site after exchange of the receiver in 2 patients. However, these infections will not appear if bone transport nails without such a receiver are used. Furthermore, 6/24 of complications that arose from backing of locking screws might be reduced by optimising screw design. However, the most severe complications, resulting in 2 permanent sequelae and 1 new pathology, were non-device related. Our current approach for treating segmental defects differs between the femur and the tibia. All femoral cases are treated by nails to avoid final treatment with external fixators. In cases of clinical infection or compromised soft tissues, extensive debridement is followed by temporal external fixation converted to an intramedullary nail within 2 weeks. Acute shortening is well tolerated on the femur, and segmental defects up to 4–6 cm are treated with a staged protocol. Acute shortening, autologous bone grafting, and standard intramedullary nailing allow for crucial early functional rehabilitation. When union has been obtained, the LLD is corrected at a second stage by standard intramedullary lengthening nail. Larger femoral defects of more than 4–6 cm are treated by femoral bone transport nails. Segmental defects of the tibia in the presence of clinical infection or compromised soft tissues are treated with external bone transport in a circular frame. At our institution, composite bone and soft-tissue loss of the leg are treated without free flaps (El-Rosasy and Ayoub 2020), and the indications for tibial bone transport nails might differ if immediate free flap coverage is provided of soft-tissue defects. We use a

492

tibial transport nail for segmental defects in cases of uncompromised soft tissues where stable fixation can be obtained with the nail. When presence of infection is suspected, workup with C-reactive protein level and PET-CT are performed. However, Moghaddam et al. (2015) found that 17% of nonunions, judged as aseptic, had positive intraoperative cultures. Therefore, we recommend thorough debridement when inserting bone transport nails, and in the case of unexpected positive cultures from resected bone, prolonged antibiotic treatment should be given (Kold and Christensen 2014). Non-union patients tend to suffer significant LLD and in our cases 7 out of 15 patients had preoperative LLD of more than 2 cm. One of the advantages with the FITBONE transport nail is the capability of additional lengthening when the bone transport phase is finished. The leg length might then be equalized within 1 surgery and the same nail unit. It is recommended by the company that the FITBONE transport nail is removed at the end of treatment, and it is mandatory that case-series should report on complications after recommended nail removal. We had a minimum of 6 months’ follow-up after nail removal. In a 70-year-old female a fracture occurred through the femoral regenerate 3 days after nail removal. This complication might have been avoided by exchanging the bone transport nail for a regular trauma nail, and this exchange technique was later performed in 3 patients. We remove all transport nails when the regenerate and the docking site have healed, and based on clinical judgement of refracture risk the need for exchange nailing is individualized. However, if explantation of nails was not needed, the need for secondary surgery and the risk of recurrent deformity and fracture might be lowered. Limitations The retrospective design of our study might lead to inaccurate reporting of complications. Furthermore, the tibial cases represent highly selected cases as most tibial bone transport cases treated at our institution in the same time period were performed with an external circular frame. In contrast, we have performed femoral bone transport with a nail only since the introduction of the femoral FITBONE transport nail at our department in 2012, as it is known that complication rates for external bone transport are higher for femoral than tibial transport (Liu et al. 2020). The lack of patient-reported outcome measures makes it impossible to conclude to what extent the bone healing and additional lengthening did improve patient quality of life. Therefore, prospective studies with stringent treatment algorithms and registration of patient-reported outcome measurements are needed. In conclusion, this retrospective case-series showed that segmental bone defects healed with a FITBONE bone transport nail in 13 of 15 cases. By introducing the motorized nail the number of device-related complications was high. Future research should focus on reducing device-related complications by optimizing nail design.

Acta Orthopaedica 2021; 92 (4): 485–492

Conceptualization: MM, OR, SK. Data extraction: MM, SK. Surgery: MM, KC, SK. Supervision: OR, SK. Writing original draft: MM. Writing, reviewing, and editing: MM, OR, KC, SK. Acta thanks Jan Duedal Rölfing and other anonymous reviewers for help with peer review of this study.

Abumunaser L A, Al-Sayyad M J. Evaluation of the Calori et al. nonunion scoring system in a retrospective case series. Orthopedics 2011; 34(5): 359. doi: 10.3928/01477447-20110317-31 Accadbled F, Thévenin Lemoine C, Poinsot E, Baron Trocellier T, Dauzere F, Sales de Gauzy J. Bone reconstruction after malignant tumour resection using a motorized lengthening intramedullary nail in adolescents: preliminary results. J Child Orthop 2019; 13(3): 324-329. doi: 10.1302/1863-2548.13.190016 Aktuglu K, Erol K, Vahabi A. Ilizarov bone transport and treatment of critical-sized tibial bone defects: a narrative review. J Orthop Traumatol 2019; 20(1). doi: 10.1186/s10195-019-0527-1 Ali A M, Villafuerte J, Hashmi M, Saleh M. Judet’s quadricepsplasty, surgical technique, and results in limb reconstruction. Clin Orthop Relat Res 2003; (415): 214-20. doi:10.1097/01.blo.0000093913.26658.9b Baumgart R. The reverse planning method for lengthening of the lower limb using a straight intramedullary nail with or without deformity correction: a new method. Oper Orthop Traumatol 2009; 21(2): 221-33. doi: 10.1007/ s00064-009-1709-4 Baumgart R, Betz A, Schweiberer L. A fully implantable motorized intramedullary nail for limb lengthening and bone transport. Clin Orthop Relat Res 1997; (343): 135-43. Black S R, Kwon M S, Cherkashin A M, Samchukov M L, Birch J G, Jo C H. Lengthening in congenital femoral deficiency: a comparison of circular external fixation and a motorized intramedullary nail. J Bone Joint Surg Am 2015; 97(17): 1432-40. doi: 10.2106/JBJS.N.00932 Calori G M, Phillips M, Jeetle S, Tagliabue L, Giannoudis P V. Classification of non-union: need for a new scoring system?. Injury 2008; 39(Suppl. 2): S59-S63 doi: 10.1016/S0020-1383(08)70016-0 El-Rosasy M A, Ayoub M A. Traumatic composite bone and soft tissue loss of the leg: region-specific classification and treatment algorithm. Injury 2020; 51(6): 1352-61. doi: 10.1016/j.injury.2020.03.041. Frost M W, Rahbek O, Traerup J, Ceccotti A A, Kold S. Systematic review of complications with externally controlled motorized intramedullary bone lengthening nails (FITBONE and PRECICE) in 983 segments. Acta Orthop 2020 Oct 27: 1-8. doi: 10.1080/17453674.2020.1835321. Epub ahead of print. Kold S, Christensen K S. Bone transport of the tibia with a motorized intramedullary lengthening nail: a case report. Acta Orthop 2014; 85(3): 333. Lee D H, Kim S, Lee J W, Park H, Kim T Y, Kim H W. A comparison of the device-related complications of intramedullary lengthening nails using a new classification system. Biomed Res Int 2017; 2017: 8032510. doi: 10.1155/2017/8032510 Liu Y, Yushan M, Liu Z, Liu J, Ma C, Yusufu A. Complications of bone transport technique using the Ilizarov method in the lower extremity: a retrospective analysis of 282 consecutive cases over 10 years. BMC Musculoskelet Disord 2020; 21: 354. doi: 10.1186/s12891-020-03335-w Moghaddam A, Zietzschmann S, Bruckner T, Schmidmaier G. Treatment of atrophic tibia non-unions according to ‘diamond concept’: results of one- and two-step treatment. Injury 2015; 46(Suppl. 4): S39-50. doi: 10.1016/S0020-1383(15)30017-6. Paley D, Maar D C. Ilizarov bone transport treatment for tibial defects. J Orthop Trauma 2000; 14(2): 76-85. doi: 10.1097/00005131-20000200000002.

Acta Orthopaedica 2021; 92 (4): 493–499

493

Pexidartinib improves physical functioning and stiffness in patients with tenosynovial giant cell tumor: results from the ENLIVEN randomized clinical trial Michiel VAN DE SANDE 1, William D TAP 2, Heather L GELHORN 3, Xin YE 4, Rebecca M SPECK 3, Emanuela PALMERINI 5, Silvia STACCHIOTTI 6, Jayesh DESAI 7, Andrew J WAGNER 8, Thierry ALCINDOR 9, Kristen GANJOO 10, Javier MARTÍN-BROTO 11, Qiang WANG 12, Dale SHUSTER 13, Hans GELDERBLOM 14, and John H HEALEY 15 1 Department of Orthopedics, Leiden University Medical Center, Leiden, the Netherlands; 2 Department of Medicine, Memorial Sloan Kettering Cancer Center and Weill Cornell Medical College, New York, NY, USA; 3 Department of Patient-Centered Research, Evidera, Bethesda, MD, USA; 4 Department of Global Health Economics and Outcomes Research, Daiichi Sankyo Inc, Basking Ridge, NJ, USA; 5 Department of Experimental, Diagnostic, and Specialty Medicine, IRCCS Istituto Ortopedico Rizzoli, Bologna, Italy; 6 Department of Cancer Medicine, Fondazione IRCCS Istituto Nazionale dei Tumori, Milan, Italy; 7 Department of Medical Oncology, Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia; 8 Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA; 9 Department of Medical Oncology, McGill University, Montreal, Quebec, Canada; 10 Department of Medical Oncology, Stanford Cancer Institute, Stanford, CA, USA; 11 Department of Medical Oncology, University Hospital Virgen del Rocio and Institute of Biomedicine of Sevilla (IBIS) (HUVR, CSIC, University of Sevilla), Seville, Spain; 12 Department of Biostatistics and Data Management, Daiichi Sankyo Inc, Basking Ridge, NJ, USA; 13 Department of Global Clinical Oncology Research and Development, Daiichi Sankyo Inc, Basking Ridge, NJ, USA; 14 Department of Medical Oncology, Leiden University Medical Center, Leiden, Netherlands; 15 Department of Orthopaedic Surgery, Memorial Sloan Kettering Cancer Center and Weill Cornell Medical College, New York, NY, USA Correspondence: majvandesande@lumc.nl Submitted 2020-08-15. Accepted 2021-01-21

Background and purpose — The ENLIVEN trial showed that, after 25 weeks, pexidartinib statistically significantly reduced tumor size more than placebo in patients with symptomatic, advanced tenosynovial giant cell tumor (TGCT) for whom surgery was not recommended. Here, we detail the effect of pexidartinib on patient-reported physical function and stiffness in ENLIVEN. Patients and methods — This was a planned analysis of patient-reported outcome data from ENLIVEN, a doubleblinded, randomized phase 3 trial of adults with symptomatic, advanced TGCT treated with pexidartinib or placebo. Physical function was assessed using the Patient-Reported Outcomes Measurement Information System (PROMIS)physical function (PF), and worst stiffness was assessed using a numerical rating scale (NRS). A mixed model for repeated measures was used to compare changes in PROMIS-PF and worst stiffness NRS scores from baseline to week 25 between treatment groups. Response rates for the PROMIS-PF and worst stiffness NRS at week 25 were calculated based on threshold estimates from reliable change index and anchor-based methods. Results — Between baseline and week 25, greater improvements in physical function and stiffness were experienced by patients receiving pexidartinib than patients receiving placebo (change in PROMIS-PF = 4.1 [95% confidence interval (CI) 1.8–6.3] vs. –0.9 [CI −3.0 to 1.2]; change in worst stiffness NRS = –2.5 [CI −3.0 to −1.9] vs.

–0.3 [CI −0.9 to 0.3]). Patients receiving pexidartinib had higher response rates than patients receiving placebo for meaningful improvements in physical function and stiffness. Improvements were sustained after 50 weeks of pexidartinib treatment. Interpretation — Pexidartinib treatment provided sustained, meaningful improvements in physical function and stiffness for patients with symptomatic, advanced TGCT. Tenosynovial giant cell tumor (TGCT) is a rare neoplasm that affects the joint synovia, bursae, and tendon sheaths (Gouin and Noailles 2017). Most patients experience pain, swelling, stiffness, instability, and reduced range of motion (Mastboom et al. 2018, Gelhorn et al. 2019). The current standard of care is surgical resection, but the tumors can be difficult to remove, frequently recur, and may require multiple surgeries or joint replacement (Ravi et al. 2011, van der Heijden et al. 2016). Therefore, non-surgical treatment options are needed. Pexidartinib is an inhibitor of the colony-stimulating factor1, KIT, and FLT3 receptor tyrosine kinases that was approved by the US Food and Drug Administration (FDA) for the treatment of adults with symptomatic TGCT associated with severe morbidity or functional limitations and not amenable to improvement with surgery (Lamb 2019). In ENLIVEN, a randomized, placebo-controlled phase 3 trial in 120 patients with symptomatic, advanced TGCT for whom surgery was not rec-

494

ommended, 39% of patients receiving pexidartinib and none of the patients receiving placebo at week 25 (p < 0.001) had a radiological response according to Response Evaluation Criteria in Solid Tumors, version 1.1 (RECIST 1.1). For patients receiving pexidartinib, 53% had a tumor response after about 2 years of treatment (Tap et al. 2019a). In addition to assessing radiographic tumor response, ENLIVEN included patient-reported outcome (PRO) measures. This included items from the Patient-Reported Outcomes Measurement Information System (PROMIS)-physical function (PF) items list and a worst stiffness numerical rating scale (NRS), which were developed specifically for patients with TGCT (Gelhorn et al. 2019). Here, we detail the effects of pexidartinib on physical function and stiffness in ENLIVEN as measured by PROMIS-PF and worst stiffness NRS. We also examine the relationship between changes in these 2 PROs and changes in tumor size.

Patients and methods Study design This was a planned analysis of PRO data from ENLIVEN (NCT02371369), a 2-part, double-blinded, multinational phase 3 trial conducted at 39 hospitals and centers in the US, Canada, Europe, and Australia (Tap et al. 2019a). In part 1 of ENLIVEN (May 2015 to September 2016), eligible patients were randomized in a 1:1 ratio to receive pexidartinib for 24 weeks (1,000 mg/day for 2 weeks, then 800 mg/day for 22 weeks) or placebo for 24 weeks. Patients who completed part 1 could enter part 2 (ongoing), in which all patients received open-label pexidartinib. Patients had to be ≥ 18 years of age, have histologically confirmed TGCT ≥ 2 cm as defined by RECIST 1.1 (Eisenhauer et al. 2009), be symptomatic, and have disease for which surgery could lead to worsening functional limitation or severe morbidity. Symptomatic disease was defined as a worst pain or worst stiffness score ≥ 4 on a scale of 0 (none) to 10 (worst imaginable) at any time during the week before the screening visit. The primary endpoint in ENLIVEN was tumor response at week 25 based on central review of MRIs using RECIST 1.1, wherein size is measured as the sum of the tumor diameters (Eisenhauer et al. 2009). Tumor size was also measured by the tumor volume score (TVS), which corresponds to the percentage of the volume of the maximally distended synovial cavity or tendon sheath involved (Tap et al. 2015). The secondary endpoints in ENLIVEN were comparative analyses at week 25 of the (1) mean change from baseline in the range of motion of the affected joint; (2) the proportion of responders based on centrally evaluated MRI scans and TVS; (3) mean change from baseline in PROMIS-PF; (4) mean change from baseline in worst stiffness NRS; (5) proportion of responders based on the Brief Pain Inventory worst pain NRS and analgesic use by the Brief Pain Inventory-30 definition; and

Acta Orthopaedica 2021; 92 (4): 493–499

(6) duration of response based on RECIST and TVS (Tap et al. 2019a). PROMIS-PF and worst stiffness NRS items were collected using an electronic handheld device. PROMIS-PF Items from the 121-item PROMIS-PF item bank were selected to measure physical functioning in ENLIVEN. Rigorous methods were used to develop and validate items in the PROMIS-PF item bank (Rose et al. 2008, Bruce et al. 2009, Hays et al. 2013). The PROMIS-PF instrument used in ENLIVEN has been described (Gelhorn et al. 2016, 2019). The items in PROMIS-PF quantitatively measure the impact of TGCT on physical functioning (mobility, dexterity, axial, and complex activity function). 2 tumor location-specific PROMIS-PF forms were used: a 13-item bank customized to assess lower limb function among patients with lower extremity tumors (Gelhorn et al. 2019); and an 11-item bank customized to assess upper limb function among patients with upper extremity tumors. 9 of the PROMIS-PF items were overlapping (i.e., included in both lower and upper extremity scales), resulting in a total of 15 unique items. Each item has 5 response options ranging from “unable to do” to “able to do without any difficulty.” PROMIS-PF scores are expressed as T-scores, which are standardized to a mean of 50 and a standard deviation of 10, wherein a higher score represents better physical function. Content validity of PROMIS-PF has been demonstrated for patients with TGCT (Gelhorn et al. 2019). Worst stiffness NRS Stiffness was evaluated using the worst stiffness NRS, a single-item self-administered questionnaire that assessed worst stiffness at site of the tumor in the last 24 hours (Gelhorn et al. 2016, 2019). The item has a response scale of 0 to 10, where 0 is “no stiffness” and 10 is “stiffness as bad as you can imagine.” A daily stiffness score was reported, and each patient’s weekly stiffness score was calculated as the average of completed records. A minimum of 4 out of 7 days of data was necessary to compute a weekly mean. Content validity of the worst stiffness NRS has been demonstrated for patients with TGCT (Gelhorn et al. 2019). Statistics Statistical analyses were performed in the intent-to-treat population using SAS version 9.4 (SAS Institute, Cary, NC, USA). Subjects were analyzed in the treatment group to which they were randomized. PROMIS-PF scores were derived using item-response theory parameters for each item (PROMIS 2020). Patternbased scores for the custom PROMIS short forms were estimated using PROC IRT in SAS version 9.4 (SAS Institute, Cary, NC, USA). As planned analyses prior to database lock and unblinding, changes in PROMIS-PF and worst stiffness NRS scores from

Acta Orthopaedica 2021; 92 (4): 493–499

baseline to week 25 were compared between treatment groups using a mixed model for repeated measures. The models specified the change in scores from baseline as the dependent variable and treatment group, time point, treatment group-by-time interaction, stratification factor of US sites versus non-US sites, the baseline value of the corresponding endpoint, and the baseline-by-time interaction as independent variables. Changes in PROMIS-PF and worst stiffness NRS scores from baseline to week 25 were analyzed using a hierarchical testing strategy to address multiplicity issues (Alosh et al. 2014, Tap et al. 2019a). A p-value of < 0.05 was considered statistically significant. Due to substantial missing post-baseline data, post hoc sensitivity analyses were performed to assess the robustness and consistency of the mixed-model repeated-measure analysis results, which included unconditional jump to reference (Carpenter et al. 2013) and delta adjustment tipping point (Mehrotra et al. 2017) analyses, both without the missing-at-random assumption. For the unconditional jump to reference analysis, missing data were imputed as follows: once a participant discontinued pexidartinib treatment, all of the attained treatment benefits were assumed to disappear, and the imputations were modelled as under placebo treatment. The delta adjustment tipping point analysis imputed missing data on the pexidartinib treatment arm by first imputing a value based on the missing-at-random assumption and then imposing a penalty of size delta to discontinued patients and a penalty of size half of delta to patients who completed part 1 of the study. The penalty was applied sequentially, and thus the 2nd missing value of a patient was assigned another delta penalty when the missing value was imputed. Missing data for patients on the placebo treatment arm were not assigned any penalty when being imputed or following missing-at-random assumption. The tipping point was determined as the point at which statistical significance was lost. The proportion of patients achieving meaningful withinpatient change thresholds for PROMIS-PF and worst stiffness NRS at week 25 between treatments was compared by Fisher’s exact test and displayed via empirical cumulative distribution function curves. If assessments were missing or sufficient data were not available to calculate the endpoint, patients were considered not to have achieved the threshold. The reliable change index (RCI), defined as 1.9*SQR (2)*SEM = 2.77*SEM, where SEM = standard error of measurement = SD (SQR (1 – reliability)) (Hays and Peipert 2018) and anchor-based methods (US Food and Drug Administration 2009) were used to estimate meaningful within-patient change thresholds. For PROMIS-PF in the TGCT population, the RCI yields a value of 6.84, and a previously published anchor-based estimate yielded a value of ≥ 3-point increase. For worst stiffness NRS in the TGCT population, the RCI value is 1.3, while the previously published anchor-based estimate is 1 (Tap et al. 2019b). For patients who continued treatment during the open-label extension (part 2) for at least 50 weeks, mean changes from

495

baseline in PROMIS-PF and worst stiffness NRS after 25 and 50 weeks of treatment were reported. Correlations between changes from baseline to week 25 in PRO scores (PROMIS-PF and worst stiffness NRS) and changes in tumor size (sum of diameters and TVS) were examined using Pearson’s correlation (r) for all evaluable patients. Estimates are presented with their 95% confidence intervals (CI). Ethics, funding, data sharing, and potential conflicts of interest ENLIVEN was approved by an institutional review board at each participating center and conducted in accordance with the Declaration of Helsinki and International Council on Harmonisation guidelines on Good Clinical Practice. All patients provided written informed consent. This study was supported by Daiichi Sankyo. De-identified individual participant data and supporting clinical study documents are available on request, depending on circumstances, at https://vivli.org. MvdS, EP, SS, JD, AJW, and HG received institutional grants from Daiichi Sankyo unrelated to the submitted work. WDT, SS, JD, AJW received personal fees from Daiichi Sankyo unrelated to the submitted work. WDT has patents pending on a companion diagnostic for CDK4 inhibitors and a treatment for metastatic sarcoma and is on scientific advisory boards for companies other than Daiichi Sankyo. MvdS, EP, SS, JD, AJW, TA, JMB, and HG received institutional grants from pharmaceutical companies other than Daiichi Sankyo unrelated to the submitted work. WDT, EP, SS, JD, AJW, TA, and JMB received personal fees from pharmaceutical companies other than Daiichi Sankyo unrelated to the submitted work. XY, QW, and DS are employees of Daiichi Sankyo. HLG and RMS are employees of Evidera, which received funding from Daiichi Sankyo. JHH is a paid consultant of Daiichi Sankyo.

Results Patients The analysis included 120 patients with symptomatic TGCT randomized and treated in ENLIVEN (59 to placebo, 61 to pexidartinib) (Figure 1, see Supplementary data). The patients had a mean age of 45 (SD 13) years, 59% were female, and 88% were white. Most TGCTs (92%) were in the lower extremities, most often in the knee (61%) and ankle (18%). Nearly all patients (98%) indicated that their tumor limited their physical function. 48 patients receiving placebo and 52 receiving pexidartinib completed part 1 of the study. In the open-label extension (part 2), 30 patients receiving placebo switched to pexidartinib and 48 patients who started on pexidartinib continued receiving it. Further details of patient characteristics have been published (Tap et al. 2019a).

496

Acta Orthopaedica 2021; 92 (4): 493–499

Table 1. Changes in PROMIS-PF and worst stiffness NRS between baseline and the end of part 1 (week 25) Outcome

Placebo Pexidartinib (n = 59) (n = 61)

p-value

PROMIS-PF score a Baseline, n 57 60 Week 25, n 32 39 Change between baseline and week 25, LS mean (95% CI) Mixed-effect model repeated measures model b −0.9 (–3.0 to 1.2) 4.1 (1.8–6.3) 0.002 Unconditional jump to reference model c −0.9 (–3.0 to 1.3) 3.5 (1.3–5.8) 0.005 Response at week 25 d,e, n (%) 3 (5) 18 (30) < 0.001 Worst stiffness NRS score Baseline, n 58 59 Week 25, n 36 34 Change between baseline and week 25, LS mean (95% CI) Mixed-effect model repeated measures model b −0.3 (–0.9 to 0.3) −2.5 (–3.0 to –1.9) < 0.001 Unconditional jump to reference model c −0.3 (–0.9 to 0.4) −2.2 (–2.8 to –1.6) < 0.001 e,f, Response at week 25 n (%) 11 (19) 24 (39) 0.02 Abbreviations: CI, confidence interval; LS, least squares; NRS, numerical rating scale; PROMIS-PF, Patient-Reported Outcomes Measurement Information System-physical function items. a Physical function was assessed daily using 15 items from the Patient-Reported Outcomes Measurement Information System (PROMIS)-physical function (PF) item bank, and a weekly average was calculated at baseline and at week 25. b Primary assessment. The model specified the change in scores from baseline as the dependent variable and treatment group, time point, treatment group-by-time interaction, stratification factor of US sites versus non-US sites, the baseline value of the corresponding endpoint, and the baseline-by-time interaction as independent variables. c Post hoc sensitivity analysis. Missing data were imputed by the missing not-at-random assumption. d Increase of ≥ 3 points from baseline. e Compared by 2-sided Fisher’s exact test. f Decrease of ≥ 1 point from baseline .

Effect of pexidartinib on physical function and stiffness at the end of part 1 (week 25) At week 25, patients receiving pexidartinib had greater improvements in PROMIS-PF and worst stiffness NRS than patients receiving placebo. The mean change in PROMIS-PF was 4.1 (CI 1.8–6.3) in patients receiving pexidartinib and −0.9 (CI −3.0 to 1.2) in patients receiving placebo, and the mean change in worst stiffness NRS was −2.5 (CI −3.0 to 1.9) in patients receiving pexidartinib and −0.3 (CI −0.9 to 0.3) in patients receiving placebo (Table 1). Week-25 PRO data were missing for as many as 46% of the patients in each treatment group due to discontinuations, patient non-compliance, and technical reasons (Figure 1, see Supplementary data). Therefore, to confirm the robustness and consistency of the findings, post hoc sensitivity analyses were conducted. Changes from baseline to week 25 remained similar using an unconditional jump to reference method: the mean change in PROMIS-PF was 3.5 (CI 1.3–5.8) in patients receiving pexidartinib and −0.9 (CI −3.0 to 1.3) in patients receiving placebo, and the mean change in worst stiffness NRS was −2.2 (CI −2.8 to −1.6) in patients receiving pexidartinib and −0.3 (CI −0.9 to 0.4) in patients receiving placebo (see Table 1). Tipping point analysis showed that the required delta for loss of statistical significance was −3.6 for PROMIS-PF and 2.5 for worst stiffness NRS, which are greater than the estimated treatment effects based on the observed data.

The empirical cumulative distribution function curves for PROMIS-PF and worst stiffness NRS depicting the proportion of participants in each treatment group reporting each level of change are shown in Figures 2 and 3 (see Supplementary data). These figures show large and consistent differences between the treatment and placebo groups. At both the RCI and anchor-based thresholds, at week 25, a greater proportion of patients receiving pexidartinib than placebo achieved meaningful change in physical function (RCI, n = 10/61 vs. 2/59, p = 0.02; anchor-based: n = 18/61 vs. 3/59, p < 0.001) and stiffness (RCI, n = 24/61 vs. 11/59, p = 0.01; anchor-based: n = 24/61 vs. 11/59, p = 0.02). Effect of pexidartinib on physical function and stiffness during the open-label extension (part 2) In patients who started on pexidartinib and continued to receive it open label during part 2 (n = 48), improvements were sustained in both PROMIS-PF (change from baseline = 3.6 [CI 2.0–5.2] at week 25 and 4.7 [CI 3.0–6.5] at week 50) and worst stiffness NRS (change from baseline = −2.7 [CI −3.4 to −1.9] at week 25 and −3.5 [CI −4.3 to −2.6] at week 50) (Table 2). Improvements were also sustained in patients who started on placebo and switched to pexidartinib (n = 30) as measured by both PROMIS-PF (change from baseline = 4.9 [CI 1.5–8.3] after 25 weeks on pexidartinib and 7.6 [CI 4.0–11.3] after 50 weeks on pexidartinib) and worst stiff-

Acta Orthopaedica 2021; 92 (4): 493–499

497

Table 2. Changes in PROMIS-PF and worst stiffness NRS during the open-label extension (part 2)

Discussion

This analysis of data from ENLIVEN confirmed that treatment with pexidartinib proAll patients duced sustained, meaningful improvements treated with pexidartinib in physical function and stiffness. These (n = 91) improvements corresponded with reductions in tumor size reported previously (Tap PROMIS-PF score a Change after 25 weeks et al. 2019a). The results also showed that on pexidartinib, n 16 38 54 the PROs, PROMIS-PF, and worst stiffness mean (95% CI) 4.9 (1.5 to 8.3) 3.6 (2.0 to 5.2) 4.0 (2.5 to 5.4) NRS can be used in TGCT clinical trials to Change after 50 weeks on pexidartinib, n 14 25 39 assess outcomes from the patient perspec mean (95% CI) 7.6 (4.0 to 11.3) 4.7 (3.0 to 6.5) 5.8 (4.1 to 7.5) tive. Worst stiffness NRS score PROMIS-PF and worst stiffness NRS Change after 25 weeks on pexidartinib, n 18 33 51 were included in ENLIVEN to confirm mean (95% CI) −3.0 (−4.5, to 1.5) −2.7 (−3.4 to −1.9) −2.8 (−3.5 to −2.1) that the decreases in tumor size reflect Change after 50 weeks changes in the tumors that are meaningful on pexidartinib, n 10 22 32 mean (95% CI) −2.2 (−4.2 to −0.2) −3.5 (−4.3 to −2.6) −3.1 (−3.9 to−2.3) to patients. These 2 PROs were adapted specifically for patients with TGCT through a Abbreviations: CI, confidence interval; NRS, numeric rating scale; PROMIS-PF, Patientprocess of a targeted literature review, cliniReported Outcomes Measurement Information System-physical function items. a Physical function was assessed daily using 15 items from the Patient-Reported Outcal expert interviews, and cognitive debriefcomes Measurement Information System (PROMIS)-physical function (PF) item bank, ing to confirm that the instructions, quesand a weekly average was calculated at baseline and at week 25. tions, and response options were relevant to and understood by patients. The content validity of the measures of these 2 PROs for ness NRS (change = −3.0 [CI −4.5–1.5] after 25 weeks on patients with TGCT has previously been described (Gelhorn pexidartinib and −2.2 [CI −4.2 to −0.2] after 50 weeks on et al. 2019). A previous psychometric study demonstrated pexidartinib). Overall, for patients included in the open-label that, for these patients, PROMIS-PF has acceptable internal extension, PROMIS-PF increased by 4.0 (CI 2.5–5.4) after 25 consistency reliability and that the 2 instruments have good weeks on pexidartinib and by 5.8 (CI 4.1–7.5) after 50 weeks, test–retest reliability, have adequate convergent validity with while worst stiffness NRS changed by −2.8 (CI −3.5 to −2.1) other PRO measures, can differentiate between known groups, after 25 weeks on pexidartinib and by −3.1 (CI −3.9 to −2.3) and can detect change over time (Speck et al. 2020). after 50 weeks. The improvements in PROMIS-PF and worst stiffness NRS correlated moderately with decreases in tumor size. This modCorrelation between changes in tumor size and PROs erate correlation was because some patients continued to expein part 1 rience reduced physical function or stiffness despite reduced Improvement from baseline in PROMIS-PF after 25 weeks tumor size. This may be due to the residual tumor affecting surof treatment correlated with a reduction of tumor size as rounding tissues or to continuing synovitis, underlying degenermeasured by RECIST 1.1 (r = −0.5, p = 0.0008; Figure 4A) ative joint disease, and possibly sequelae from earlier surgeries. and TVS (r = −0.3, p = 0.006; Figure 4B). Improvement from Improvements in PROMIS-PF were greater in patients who baseline in worst stiffness NRS after 25 weeks of treatment started on placebo and switched to pexidartinib than patients also correlated with a reduction of tumor size as measured by who started on pexidartinib and continued to receive it. This RECIST 1.1 (r = 0.5, p < 0.001; Figure 5A) and TVS (r = 0.4, finding may be due to the relatively small sample size in p < 0.001; Figure 5B). The correlation plots revealed that this analysis and individual differences in physical function for patients whose physical function or stiffness worsened, improvement. Because part 2 of ENLIVEN was open label, the extent of worsening was, in most cases, less in patients patients who switched over from placebo may have expected a receiving pexidartinib than in patients receiving placebo. “treatment effect,” which could have biased the results. A strength of this analysis is that the PROs were developed The plots also revealed that, although some patients receiving pexidartinib continued to have some worsening physi- specifically for patients with TGCT and, according to FDA cal function or stiffness, only a single patient had a small guidance, have acceptable psychometric properties (Gelhorn increase in the sum of diameters of target lesions (but no et al. 2019). Other PROs, including the 36-Item Short-Form increase in TVS). Health Survey, visual analogue scale for pain, and Western Outcome

Received placebo during part 1, switched to pexidartinib during part 2 (n = 30)

Received pexidartinib during parts 1 and 2 (n = 61)

498

Acta Orthopaedica 2021; 92 (4): 493–499

spoor et al. 2019). Thus, the current study provides valid, robust data Pexidartinib Pexidartinib Placebo Placebo on how patients with TGCT per40 40 ceive the effects of treatment with 30 30 pexidartinib. Another strength of 20 20 this analysis is that it relied on the 10 10 results of a randomized, placebocontrolled clinical trial, which pro0 0 vided robust, prospective data and –10 –10 controlled for patient bias. –20 –20 Nonetheless, some limitations should be considered when inter–30 –30 preting the results. Most of all, –40 –40 substantial post-baseline data were –50 –50 missing for PROMIS-PF and worst –100 –80 –60 –40 –20 0 20 40 60 –100 –80 –60 –40 –20 0 20 40 60 Change in tumor size (%) B Change in TVS (%) A stiffness NRS due to discontinuations, patient non-compliance, Figure 4. Correlation between change in physical function and change in tumor size (sum of diameters of target lesions) and tumor volume score (TVS) in adults with symptomatic, advanced tenosynovial and technical issues with the elecgiant cell tumor treated with pexidartinib or placebo in ENLIVEN. tronic data-collection device. This In part 1 of ENLIVEN, adult patients with symptomatic, advanced tenosynovial giant cell tumor were included 8 patients who disconrandomized to treatment with pexidartinib (1,000 mg/day for 2 weeks, then 800 mg/day for 22 weeks) or placebo for 24 weeks. Physical function was assessed daily using 15 items from the Patient-Reported tinued pexidartinib due to hepatic Outcomes Measurement Information System (PROMIS)-physical function (PF) item bank, and a weekly adverse events, including 4 cases average was calculated at baseline and at week 25. The figure shows that improvement between baseof mixed or cholestatic hepatoline and week 25 in PROMIS-PF correlated with the reduction in tumor size during the same period when measured by either RECIST 1.1 (Pearson’s r = −0.49, p = 0.0008; panel A) or tumor volume score toxicity, which led to the US FDA (TVS; Pearson’s r = −0.34, p = 0.006; panel B). establishing a risk evaluation and mitigation strategy (REMS) program. Sensitivity analyses Change in worst stiffness NRS (%) Change in worst stiffness NRS (%) conducted to address the poten100 100 Pexidartinib Pexidartinib tial of informative missing data Placebo Placebo 80 80 confirmed that the differences in 60 60 scores between placebo and pexidartinib were statistically signifi40 40 cant, although estimates of treat20 20 ment effect size could have been 0 0 affected. In addition, although a –20 –20 conservative approach was taken to handling missing data by assuming –40 –40 that all patients with missing data –60 –60 were non-responders, response –80 –80 rates for placebo and pexidartinib remained statistically significantly –100 –100 –100 –80 –60 –40 –20 0 20 40 60 –100 –80 –60 –40 –20 0 20 40 60 different. Another potential limitaChange in TVS (%) A Change in tumor size (%) B tion of this study is that follow-up Figure 5. Correlation between change in worst stiffness and change in tumor size (sum of diameters of data on symptoms for patients who target lesions) and tumor volume score (TVS) in adults with symptomatic, advanced tenosynovial giant cell tumor treated with pexidartinib or placebo in ENLIVEN. discontinued the study were not In part 1 of ENLIVEN, adult patients with symptomatic, advanced tenosynovial giant cell tumor were collected, so we could not deterrandomized to treatment with pexidartinib (1,000 mg/day for 2 weeks, then 800 mg/day for 22 weeks) or mine the duration of benefits of placebo for 24 weeks. Worst stiffness was assessed daily using a numerical rating scale (NRS), and a weekly average was calculated. The figure shows that improvement between baseline and week 25 in pexidartinib after treatment was worst stiffness NRS correlated with the reduction in tumor size during the same period when measured discontinued. Finally, we did not by either RECIST 1.1 (Pearson’s r = 0.53, p = 0.0004; panel A) or tumor volume score (TVS; Pearson’s determine whether the results difr = 0.43, p = 0.0003; panel B). fered according to tumor location. Ontario and McMaster Universities Osteoarthritis Index, have Nonetheless, the results support the conclusion that pexidarbeen used to examine the effect of surgery on quality of life tinib durably improves physical function and stiffness from and joint function, but none of them are specific to TGCT (Ver- the patient perspective. Change in PROMIS–PF (%)

Change in PROMIS–PF (%)

Acta Orthopaedica 2021; 92 (4): 493–499

In conclusion, this analysis demonstrated the benefit of pexidartinib in patients with symptomatic TGCT for whom surgery, if possible, would be associated with severe morbidity or functional limitation. This benefit must be carefully balanced against the risk of severe liver and other toxicities. Supplementary data Figures 1–3 are available as supplementary data in the online version of this article, http://dx.doi.org/10.1080/17453674.2021. 1922161

All authors approved the final manuscript and participated in writing, editing, or providing comments on the manuscript. MvdS, WDT, HG, and JHH participated in study design and data collection; HLG, XY, QW, RMS, and DS participated in study design; and EP, SS, JD, AJW, TA, KG, and JMB participated in data collection. All authors met the ICMJE criteria for authorship. The study was funded by Daiichi Sankyo. JHH and WDT also received funding from the National Institutes of Health/National Cancer Institute (#P30-CA008748). Medical writing was provided by Phillip Leventhal, PhD and Julia Zolotarjova, MSc, MWC of Evidera and funded by Daiichi Sankyo. Acta thanks Mikael Eriksson and Ron D Hays for help with peer review of this study.

Alosh M, Bretz F, Huque M. Advanced multiplicity adjustment methods in clinical trials. Stat Med 2014; 33(4): 693-713. Bruce B, Fries J F, Ambrosini D, Lingala B, Gandek B, Rose M, Ware J E. Better assessment of physical function: item improvement is neglected but essential. Arthritis Res Ther 2009; 11(6): R191. Carpenter J R, Roger J H, Kenward M G. Analysis of longitudinal trials with protocol deviation: a framework for relevant, accessible assumptions, and inference via multiple imputation. J Biopharm Stat 2013; 23(6): 1352-71. Eisenhauer E A, Therasse P, Bogaerts J, Schwartz L H, Sargent D, Ford R, Dancey J, Arbuck S, Gwyther S, Mooney M, Rubinstein L, Shankar L, Dodd L, Kaplan R, Lacombe D, Verweij J. New response evaluation criteria in solid tumours: revised RECIST guideline (version 1.1). Eur J Cancer 2009; 45(2): 228-47. Gelhorn H L, Tong S, McQuarrie K, Vernon C, Hanlon J, Maclaine G, Lenderking W, Ye X, Speck R M, Lackman R D, Bukata S V, Healey J H, Keedy V L, Anthony S P, Wagner A J, Von Hoff D D, Singh A S, Becerra C R, Hsu H H, Lin P S, Tap W D. Patient-reported symptoms of tenosynovial giant cell tumors. Clin Ther 2016; 38(4): 778-93. Gelhorn H L, Ye X, Speck R M, Tong S, Healey J H, Bukata S V, Lackman R D, Murray L, Maclaine G, Lenderking W R, Hsu H H, Lin P S, Tap W D. The measurement of physical functioning among patients with tenosynovial giant cell tumor (TGCT) using the patient-reported outcomes measurement information system (PROMIS). J Patient Rep Outcomes 2019; 3(1): 6. Gouin F, Noailles T. Localized and diffuse forms of tenosynovial giant cell tumor (formerly giant cell tumor of the tendon sheath and pigmented villonodular synovitis). Orthop Traumatol Surg Res 2017; 103(1S): S91-7.

499

Hays R D, Peipert J D. Minimally important differences do not identify responders to treatment. JOJ scin 2018; 1(1): 555552. Hays R D, Spritzer K L, Amtmann D, Lai J S, Dewitt E M, Rothrock N, Dewalt D A, Riley W T, Fries J F, Krishnan E. Upper-extremity and mobility subdomains from the Patient-Reported Outcomes Measurement Information System (PROMIS) adult physical functioning item bank. Arch Phys Med Rehabil 2013; 94(11): 2291-6. Lamb Y N. Pexidartinib: first approval. Drugs 2019; 79(16): 1805-12. Mastboom M J, Planje R, van de Sande M A. The patient perspective on the impact of tenosynovial giant cell tumors on daily living: crowdsourcing study on physical function and quality of life. Interact J Med Res 2018; 7(1): e4. Mehrotra D V, Liu F, Permutt T. Missing data in clinical trials: controlbased mean imputation and sensitivity analysis. Pharm Stat 2017; 16(5): 378-92. PROMIS. HealthMeasures; 2020. Available from: https://www.healthmeasures.net/explore-measurement-systems/promis. Accessed October 23, 2020. Ravi V, Wang W L, Lewis V O. Treatment of tenosynovial giant cell tumor and pigmented villonodular synovitis. Curr Opin Oncol 2011; 23(4): 361-6. Rose M, Bjorner J B, Becker J, Fries J F, Ware J E. Evaluation of a preliminary physical function item bank supported the expected advantages of the Patient-Reported Outcomes Measurement Information System (PROMIS). J Clin Epidemiol 2008; 61(1): 17-33. Speck R M, Ye X, Bernthal N M, Gelhorn H L. Psychometric properties of a custom patient-reported outcomes measurement information system (PROMIS) physical function short form and worst stiffness numeric rating scale in tenosynovial giant cell tumors. J Patient Rep Outcomes 2020; 4(1): 61. Tap W D, Wainberg Z A, Anthony S P, Ibrahim P N, Zhang C, Healey J H, Chmielowski B, Staddon A P, Cohn A L, Shapiro G I, Keedy V L, Singh A S, Puzanov I, Kwak E L, Wagner A J, Von Hoff D D, Weiss G J, Ramanathan R K, Zhang J, Habets G, Zhang Y, Burton E A, Visor G, Sanftner L, Severson P, Nguyen H, Kim M J, Marimuthu A, Tsang G, Shellooe R, Gee C, West B L, Hirth P, Nolop K, van de Rijn M, Hsu H H, Peterfy C, Lin P S, Tong-Starksen S, Bollag G. Structure-guided blockade of CSF1R kinase in tenosynovial giant-cell tumor. N Engl J Med 2015; 373(5): 428-37. Tap W D, Gelderblom H, Palmerini E, Desai J, Bauer S, Blay J Y, Alcindor T, Ganjoo K, Martin-Broto J, Ryan C W, Thomas D M, Peterfy C, Healey J H, van de Sande M, Gelhorn H L, Shuster D E, Wang Q, Yver A, Hsu H H, Lin P S, Tong-Starksen S, Stacchiotti S, Wagner A J. Pexidartinib versus placebo for advanced tenosynovial giant cell tumour (ENLIVEN): a randomised phase 3 trial. Lancet 2019a; 394(10197): 478-87. Tap W D, Speck R M, Ye X, Palmerini E, Stacchiotti S, Desai J, Wagner A J, Alcindor T, Ganjoo K N, Broto J M, Wang Q, Shuster E, Gelhorn H, Gelderblom H. Responder analysis of patient-reported outcomes measurement information system (PROMIS) physical function (PF) and worst stiffness among patients with tenosynovial giant cell tumors (TGCT) in the ENLIVEN study. J Clin Oncol 2019b; 37(15 Suppl.): e18236. US Food and Drug Administration. Guidance for industry on patientreported outcome measures: use in medical product development to support labeling claims; 2009. Available from: https://www.fda.gov/media/77832/ download. Accessed January 7, 2021. van der Heijden L, Piner S R, van de Sande M A. Pigmented villonodular synovitis: a crowdsourcing study of two hundred and seventy two patients. Int Orthop 2016; 40(12): 2459-68. Verspoor F G M, Mastboom M J L, Hannink G, van der Graaf W T A, van de Sande M A J, Schreuder H W B. The effect of surgery in tenosynovial giant cell tumours as measured by patient-reported outcomes on quality of life and joint function. Bone Joint J 2019; 101-B(3): 272-80.

500

Acta Orthopaedica 2021; 92 (4): 500

Erratum

of Orthopedic Surgery, Örebro University Hospital; 2 Department of Orthopedics, Örebro University; 3 Department of Orthopaedics, Skaraborg Hospital, Skövde, Sweden Correspondence: evelina.pantzar@regionorebrolan.se Submitted 2020-10-23. Accepted 2021-03-21. DOI 10.1080/17453674.2021.1912941 Published online ahead of print

Erronous affiliations were published in the online version. The correct affiliations are stated below. Evelina Hanna Sofia PANTZAR-CASTILLA 1, Per WRETENBERG 1, and Jacques RIAD 1,2,3

1 Department of Orthopedic Sugery, Örebro University Hospital; 2 Department of Orthopedics, Institute of Clinical Science, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden; 3 Department of Orthopaedics, Skaraborg Hospital, Skövde, Sweden Correspondence: evelina.pantzar@regionorebrolan.se Submitted 2020-10-23. Accepted 2021-03-21.

4/21 ACTA ORTHOPAEDICA

Medical

IMPROVE THE CHANCES

REDUCE RISK FOR INFECTION Reduction of infection risk* using dual antibiotic-loaded bone cement in high risk patients

in aseptic revision TKA * as reported in study results

10453

www.heraeus-medical.com

in fractured neck of femur

Vol. 92, No. 4, 2021 (pp. 371–500)

34 % 69 % 57 %

in primary hip & knee arthroplasty

Volume 92, Number 4, August 2021

COVER.indd 1

02-07-2021 16:46:37