Legend (support categories):
• Full vendor support
• Indirect, but comprehensive support, by vendor
• Vendor support, but not (yet) entirely comprehensive
• Comprehensive support, but not by vendor
• Limited, probably indirect support, but at least some
• No direct support available, but of course one could ISO-C-bind your way through it or directly link the libraries

Languages: C++ (sometimes also C), Fortran

Each cell gives the number of the footnote that describes the support status.

        CUDA              HIP               SYCL              OpenACC           OpenMP            Standard          Kokkos            ALPAKA            Python
        C++      Fortran  C++      Fortran  C++      Fortran  C++      Fortran  C++      Fortran  C++      Fortran  C++      Fortran  C++      Fortran
NVIDIA  1        2        3        4        5        6        7        8        9        10       11       12       13       14       15       16       17
AMD     18       19       20       4        21       6        22       23       24       24       25       26       27       14       28       16       29
Intel   30       31       32       33       34       6        35       35       36       36       37       38       39       14       40       16       41

• 1: CUDA C/C++ is supported on NVIDIA GPUs through the CUDA Toolkit
• 2: CUDA Fortran, a proprietary Fortran extension, is supported on NVIDIA GPUs via the NVIDIA HPC SDK
• 3: HIP programs can directly use NVIDIA GPUs via a CUDA backend; HIP is maintained by AMD
• 4: No such thing as HIP for Fortran, but AMD offers Fortran interfaces to HIP and ROCm libraries in hipfort
• 5: SYCL can be used on NVIDIA GPUs with experimental support either in SYCL directly or in DPC++, or via hipSYCL
• 6: No such thing as SYCL for Fortran
• 7: OpenACC C/C++ supported on NVIDIA GPUs directly (and best) through NVIDIA HPC SDK; additional, somewhat limited support by GCC C compiler and in LLVM through Clacc
• 8: OpenACC Fortran supported on NVIDIA GPUs directly (and best) through NVIDIA HPC SDK; additional, somewhat limited support by GCC Fortran compiler and Flacc
• 9: OpenMP in C++ supported on NVIDIA GPUs through NVIDIA HPC SDK (albeit with a few limits), by GCC, and Clang; see the OpenMP ECP BoF on status in 2022
• 10: OpenMP in Fortran supported on NVIDIA GPUs through NVIDIA HPC SDK (but not the full OpenMP feature set available), by GCC, and Flang
• 11: pSTL features supported on NVIDIA GPUs through NVIDIA HPC SDK
• 12: Standard Language parallel features supported on NVIDIA GPUs through NVIDIA HPC SDK
• 13: Kokkos supports NVIDIA GPUs by calling CUDA as part of the compilation process
• 14: Kokkos is a C++ model, but an official compatibility layer (Fortran Language Compatibility Layer, FLCL) is available
• 15: Alpaka supports NVIDIA GPUs by calling CUDA as part of the compilation process; also, an OpenMP backend can be used
• 16: Alpaka is a C++ model
• 17: There is a vast community around offloading Python code to NVIDIA GPUs, with CuPy, Numba, cuNumeric, and many others; NVIDIA actively supports a lot of them, but has no direct product like CUDA for Python; so, the status is somewhere in between
• 18: hipify by AMD can translate CUDA calls to HIP calls, which run natively on AMD GPUs
• 19: AMD offers a Source-to-Source translator to convert some CUDA Fortran functionality to OpenMP for AMD GPUs (gpufort); in addition, there are ROCm library bindings for Fortran in hipfort
• 20: HIP is the preferred native programming model for AMD GPUs
• 21: SYCL can use AMD GPUs, for example with hipSYCL or DPC++ for HIP AMD
• 22: OpenACC C/C++ can be used on AMD GPUs via GCC or Clacc; also, Intel's OpenACC to OpenMP Source-to-Source translator can be used to generate OpenMP directives from OpenACC directives
• 23: OpenACC Fortran can be used on AMD GPUs via GCC; also, AMD's gpufort Source-to-Source translator can move OpenACC Fortran code to OpenMP Fortran code, and Intel's translator can work as well
• 24: AMD offers a dedicated, Clang-based compiler for using OpenMP on AMD GPUs: AOMP; it supports both C/C++ (Clang) and Fortran (Flang, example)
• 25: Intel's DPC++ (oneAPI) can be compiled with an experimental HIP AMD backend, allowing STL algorithms to be launched on AMD GPUs; caveats from Intel's STL support apply
• 26: Currently, no (known) way to launch Standard-based parallel algorithms on AMD GPUs
• 27: Kokkos supports AMD GPUs through HIP
• 28: Alpaka supports AMD GPUs through HIP or through an OpenMP backend
• 29: AMD does not officially support GPU programming with Python (also not semi-officially like NVIDIA), but third-party support is available, for example through Numba (currently inactive) or a HIP version of CuPy
• 30: SYCLomatic translates CUDA code to SYCL code, allowing it to run on Intel GPUs; also, Intel's DPC++ Compatibility Tool can transform CUDA to SYCL
• 31: No direct support, only via ISO C bindings, but at least an example can be found on GitHub; it's pretty scarce and not by Intel itself, though
• 32: CHIP-SPV supports mapping CUDA and HIP to OpenCL and Intel's Level Zero, making it run on Intel GPUs
• 33: No such thing as HIP for Fortran
• 34: SYCL is the prime programming model for Intel GPUs; actually, SYCL is only a standard, while Intel's implementation of it is called DPC++ (Data Parallel C++), which extends the SYCL standard in various places; actually actually, Intel namespaces everything oneAPI these days, so the full proper name is Intel oneAPI DPC++ (which incorporates a C++ compiler and also a library)
• 35: OpenACC can be used on Intel GPUs by translating the code to OpenMP with Intel's Source-to-Source translator
• 36: Intel has extensive support for OpenMP through their latest compilers
• 37: Intel supports pSTL algorithms through their DPC++ Library (oneDPL; GitHub). It's heavily namespaced and not yet on the same level as NVIDIA
• 38: With Intel oneAPI 2022.3, Intel supports DO CONCURRENT with GPU offloading
• 39: Kokkos supports Intel GPUs through SYCL
• 40: Alpaka v0.9.0 introduces experimental SYCL support; also, Alpaka can use OpenMP backends
• 41: Not a lot of support available at the moment, but notably DPNP, a SYCL-based drop-in replacement for Numpy, and numba-dpex, an extension of Numba for DPC++
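
To make the OpenMP C++ entries (footnotes 9, 24, and 36) more concrete, here is a minimal sketch of a SAXPY loop offloaded with OpenMP target directives. The function name `saxpy` and the choice of kernel are illustrative assumptions, not taken from the overview. The same source should in principle be accepted by each vendor's OpenMP-capable compiler, but the flags that enable offloading differ by compiler and version (for example, `-mp=gpu` with NVIDIA's `nvc++`), and feature coverage varies as the footnotes describe.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative SAXPY (y = a * x + y) using OpenMP target offloading.
// Hypothetical helper for this sketch; not part of any vendor toolkit.
// If the compiler is invoked without OpenMP offloading enabled, the pragma
// is ignored (or execution falls back to the host) and the loop runs on the CPU.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    const std::size_t n = x.size();
    const float* xp = x.data();  // raw pointers so the map clauses can
    float* yp = y.data();        // describe the array sections explicitly

    // Copy x to the device, copy y to the device and back afterwards,
    // and distribute the loop iterations over teams and threads.
    #pragma omp target teams distribute parallel for \
        map(to: xp[0:n]) map(tofrom: yp[0:n])
    for (std::size_t i = 0; i < n; ++i) {
        yp[i] = a * xp[i] + yp[i];
    }
}
```

Calling `saxpy(3.0f, x, y)` with `x` filled with 1 and `y` filled with 2 leaves every element of `y` at 5, regardless of whether the loop ran on a GPU or fell back to the host. Raw pointers are used inside the target region because mapping `std::vector` objects directly is not portable across implementations.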